CORRELATIONS AMONG CERTAIN MEASURES OF
STUDENT ABILITY


JOHN M. BREWER 


Harvard University 


This report is concerned with relationships existing among various 
measures of student ability. It indicates some tentative answers 
to such a question as: If a student does well in the required papers 
of a course what are the probabilities that his score in the course 
examination and his final mark for the course will be satisfactory? 

In a previous study¹ of examinations in the Graduate School
of Education at Harvard University the author noted that the final 
grade in courses was ordinarily composed of the final examination, 
weighted about fifty per cent, and a number of other measures of 
ability used throughout the course. These other measures were 
listed chiefly as follows: Short examinations, thesis, assigned papers, 
reports on required reading, reports on visits, preparation of bibli- 
ographies, conferences, class discussion, and case reports. The present 
study was undertaken to discover statistical relationships which might 
exist among the numerical evaluations of these several measures of 
student ability in the work of the course. 

¹ A Study of Examinations in Graduate Courses in Education. The Educational
Record, Vol. IX, No. 4, October, 1928, pp. 225-241.

Are Other Measures Needed?—It may be asked at the outset whether
or not these miscellaneous other measures of ability are needed.
Will not the final examination alone suffice? It is occasionally stated
by proponents of examinations that they resemble closely life situations
in which students are likely to find themselves. It is truly stated
that the holder of a graduate degree in education ought to be able
and must be able to answer systematically in writing questions similar
to those proposed in a final examination, particularly such questions
as involve reasoning, exposition, and application. Other teachers of
education maintain that situations in the field require also important
other abilities, such as the writing of a long report where one has 
access to books as he does not have in an ordinary examination, 
ability to maintain one’s viewpoint in group discussion, solution of 
cases, preparation of reading lists, the reproduction of such facts as 
may lie at the basis of a given educational policy, and the like. Con- 
sequently, it is maintained that the measurement of such abilities 
should be provided for in making up the final grade of a course in 
education, if, indeed, the course itself is supposed to be a preparation 
for actual work in the field. 

Relationships among the Several Measures.—Correlations are 
obtainable among some of these several abilities, and particularly 
between final examinations and each of the other important measures. 
If the final examination is weighted as constituting half the score 
of the final mark for the course, it is important for the sake of unifying 
the work of the course and the thinking of the individual student, 
to discover these relationships. 

This report presents data bearing upon these problems, drawn 
from four classes in the Graduate School of Education of Harvard 
University. 

Correlations in First Class Studied.—The chief measures of student 
ability used during the progress of the first class under consideration 
were the grades on a number of required expository papers. The 
scores in these papers, together with one or two minor measures, 
were combined (on a fifty-fifty basis) with the final examination to 
obtain the final score in the course. What, then, is the correlation 
among these three measures, the required papers, the final examination, 
and the final score? If a person does well in the required papers 
which he has handed in throughout the course, what is the likelihood 
that his final examination will be good and his final mark high? 
Following are the correlations obtained: 


Number of students in class..................... 48

Required papers with examination................ r = .45 ± .08
Required papers with final score¹............... r = .78 ± .04
Examination with final score.................... r = .89 ± .02
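
The figure following each coefficient appears to be its probable error. The report does not state the formula used; the sketch below assumes the conventional one, PE = 0.6745(1 − r²)/√N, and takes the class size as 48. Under that assumption the three printed values are reproduced to two decimals.

```python
import math

def probable_error(r: float, n: int) -> float:
    """Probable error of a correlation coefficient.

    Assumption: the conventional formula 0.6745 * (1 - r^2) / sqrt(n),
    which the report itself does not state.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

CLASS_SIZE = 48  # class size printed above the coefficients

for label, r in [
    ("Required papers with examination", 0.45),
    ("Required papers with final score", 0.78),
    ("Examination with final score", 0.89),
]:
    print(f"{label}: r = {r:.2f} ± {probable_error(r, CLASS_SIZE):.2f}")
```

The same function applied with the class sizes given for the later classes can be used to check the other tables of coefficients.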


¹ The reader should note that this is a correlation of a part with the whole.
One statistical authority states that such a correlation is probably one-third
higher than it would be if the two items were independent of each other.

In the view of the writer a clearer picture of correlation is obtained
from the use of probability tables than from the coefficient of correlation.
The probability score involves the same procedure as that of the
scatter diagram, with three or more equal divisions in the two related 
categories, and with the figures in each row and column reduced to 
percentages. In our first table which follows, the class was divided 
into three equal parts in two different ways: On the basis of scores 
in required papers and on the basis of final scores in the course. The 
resulting numbers were then reduced horizontally and vertically to 
per cents, using the nearest whole number. 


PROBABILITY TABLE
Required Papers and Final Scores

The table reads as follows, across the first line: Of the lowest third of the class
in scores on required papers, seventy-five per cent were in the lowest third in
final scores, nineteen per cent were in the middle third, and six per cent in the
highest third.

                              Final scores
Required papers          Low    Middle    High
Low....................   75      19        6
Middle.................   19      56       25
High...................    6      25       69





It seems possible with the use of such a table for the student to 
reckon his probabilities in the final mark on the basis of his record in 
class papers. 
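
The construction of such a table can be sketched as follows. This is an illustration only: the nine pairs of scores are invented, ties are broken by position in the list, and only row percentages are computed (with equal thirds, the column percentages are formed in the same way).

```python
from collections import Counter

def third_of(rank: int, n: int) -> int:
    """Return 0, 1 or 2 for the lowest, middle or highest third (rank is 0-based)."""
    return min(rank * 3 // n, 2)

def probability_table(x, y):
    """Cross-tabulate thirds of x against thirds of y, as whole-number row per cents."""
    n = len(x)
    order_x = sorted(range(n), key=lambda i: x[i])  # students ranked on x
    order_y = sorted(range(n), key=lambda i: y[i])  # students ranked on y
    group_x = {i: third_of(rank, n) for rank, i in enumerate(order_x)}
    group_y = {i: third_of(rank, n) for rank, i in enumerate(order_y)}
    counts = Counter((group_x[i], group_y[i]) for i in range(n))
    table = []
    for gx in range(3):
        row_total = sum(counts[(gx, gy)] for gy in range(3))
        table.append([round(100 * counts[(gx, gy)] / row_total) for gy in range(3)])
    return table

# Hypothetical scores for nine students (not data from the report).
papers = [50, 55, 60, 65, 70, 75, 80, 85, 90]
final = [52, 61, 58, 70, 66, 81, 77, 88, 95]

for name, row in zip(["Low", "Middle", "High"], probability_table(papers, final)):
    print(f"{name:<8}{row}")
```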

From the data of the same class we next present a four point prob- 
ability table showing the same relationship. 


DATA IN FOUR POINT PROBABILITY TABLE

Here the class is divided by quartiles on the same two bases. Read horizontally
as follows: Of the students in the lowest quarter in marks in required
papers, eighty-three per cent were in the lowest quarter for the final scores, none
were in the next quarter, seventeen per cent were in the second from the highest,
and none in the highest.

                              Final scores
Required papers          Low                      High
Low....................   83      0     17      0
                          17     41     17     25
                           0     41     33     25
High...................    0     17     33     50
Correlations for Second Class Studied.—A second class, conducted
in substantially the same way, yielded results not dissimilar, as follows: 


Number of students in class..................... 33

Required papers with examination................ r = .42 ± .09
Required papers with final score................ r = .80 ± .04
Examination with final score.................... r = .81 ± .04
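
The two coefficients involving the final score are correlations of a part with the whole, as noted for the first class. A rough sketch of how much of their size follows from that fact alone is given below. It assumes that the final score is a fifty-fifty sum of the two parts and that the two parts have equal standard deviations; the report gives neither the spreads nor the exact weights for this class, so the result is an illustration, not a reanalysis.

```python
import math

def part_whole_r(r: float, sd_part: float = 1.0, sd_other: float = 1.0,
                 w_part: float = 0.5, w_other: float = 0.5) -> float:
    """Correlation of one part with the weighted sum of both parts.

    r is the correlation between the two parts. The equal weights and
    equal standard deviations used by default are assumptions.
    """
    a = w_part * sd_part
    b = w_other * sd_other
    return (a + b * r) / math.sqrt(a * a + b * b + 2.0 * a * b * r)

# With r = .42 between papers and examination, as reported for this class:
print(round(part_whole_r(0.42), 2))

# Even two unrelated parts correlate appreciably with their own sum:
print(round(part_whole_r(0.0), 2))
```

Under these assumptions a coefficient in the neighborhood of .8 between either part and the final score is to be expected from a correlation of .42 between the parts themselves.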


The data drawn from this course are also thrown into probability 
tables as follows: 


REQUIRED PAPERS AND EXAMINATION

                              Examination
Required papers          Low    Middle    High
Low....................   55      27       18
Middle.................   27      46       27
High...................   18      27       55

REQUIRED PAPERS AND FINAL SCORE

                              Final score
Required papers          Low    Middle    High
Low....................   73      27        0
Middle.................   18      55       27
High...................    9      18       73














Correlations for Third Class.—In the third class studied correlations 
have been worked out between pairs of a number of different measures, 
with results as follows: 


Number of students in course.................... 45
True-false test with final score................ r = .56 ± .07
True-false test on basic textbook with final
  examination................................... r = .31 ± .09
Required papers with examination................ r = [illegible]
True-false test and required papers with
  examination................................... r = .61 ± .06
Required papers with final score................ r = [illegible]
Required papers and above test with final score. r = .89 ± .02
Examination with final score.................... r = .91 ± .02
Following are five probability tables for course three, the first a
five point table, and the others three point tables.

1. TRUE-FALSE TEST AND FINAL SCORE

                                           Final score
True-false test on basic textbook    Low          Middle         High
Low................................   44     33     22      0      0
                                      22     22     22     11      0
Middle.............................   22     11     33     11     22
                                      11     22     22     33     11
High...............................    0     11      0     33     56

2. TRUE-FALSE TEST AND FINAL SCORE

                                           Final score
True-false test on basic textbook    Low    Middle    High
Low................................   53      47        0
Middle.............................   33      27       40
High...............................   13      27       60

3. TRUE-FALSE TEST AND EXAMINATION

                                           Examination
True-false test on basic textbook    Low    Middle    High
Low................................   53      20       27
Middle.............................   27      40       33
High...............................   20      40       40

4. REQUIRED PAPERS AND EXAMINATION

                                           Examination
Required papers                      Low    Middle    High
Low................................   53      27       20
Middle.............................   40      40       20
High...............................    7      33       60

5. REQUIRED PAPERS AND FINAL SCORE

                                           Final score
Required papers                      Low    Middle    High
Low................................   73      27        0
Middle.............................   27      67        7
High...............................    0       7       93





Correlations Involving Cases, in the Fourth Class Studied.—Much
interest has been manifested in certain quarters in the use of cases 
for teaching the principles of education, and particularly in the teach- 
ing of vocational and educational guidance. This work has not 
proceeded very far as yet and conclusions are not permissible. How- 
ever, an attempt was made to measure the relationship existing between 
total scores made by students in four case reports and the score in 
certain questions of the final examination. This examination itself 
included a case to be solved, but gave also four questions requiring 
expository answers. The instructor was interested in finding out 
whether or not a close relationship existed between the marks obtained 
on case solutions and those of the old-fashioned expository answers 
of the examination book. The following were the correlations 
obtained: 


Number of students in class..................... 40
Examination answers with final score............ r = .73 ± .05
Case reports with final score................... r = .66 ± .06
Examination answers with case reports........... r = .17 ± .1


The probability tables for these correlations, based on quartile 
divisions, and using tens instead of per cents, which the class of forty 
makes possible, are as follows: 
1. EXAMINATION ANSWERS AND FINAL SCORE

Read this table as follows: Of the ten students who were in the lowest quarter
in examination answers, six were in the lowest quarter in the final score, three
in the second lowest quarter, one in the next to highest quarter, and none in the
highest.

                                      Final score
Examination answers      Low                                  High
Low....................    6           3           1           0
                           3           4           1           2
                       [illegible] [illegible] [illegible] [illegible]
High................... [illegible] [illegible] [illegible] [illegible]

2. CASE REPORTS AND FINAL SCORE

                                      Final score
Case reports             Low                                  High
Low....................    5       [illegible]     2           0
                           2       [illegible]     8           0
                       [illegible] [illegible] [illegible] [illegible]
High...................    0       [illegible] [illegible]     6

3. EXAMINATION ANSWERS AND CASE REPORTS

                              Case reports
Examination answers      Low                  High
Low....................    2      3      4      1
                           4      3      1      2
                           3      1      2      4
High...................    1      3      3      3











Here the correlation figure as well as the probability table indicates 
clearly the negligible amount of correlation between these two factors. 
The positive diagonal of the probability square is distinctly heavier 
than the negative diagonal, but there is little other consistent
relationship.
Concluding Observations.—It is within the bounds of possibility
that before many years we may have statistical data sufficient to work 
out correlations between course grades and service in the field. Scien- 
tific follow-up investigations are necessary, if we are to obtain the 
necessary information to be used, on the one hand, to improve our 
courses, and on the other, to give such after-guidance in the field as 
shall make our work accomplish the objects for which it is intended. 
At present it is probably fallacious to believe that when people are 
taught the principles of education we can be assured that they will 
apply these. 

For the present, whatever may be the efficiency of our work as 
manifested by students in examinations, it may prove valuable to 
review and discuss among ourselves all the various measures of student 
ability which we now use. It is quite likely that we need to obtain a 
better estimate and appraisal of the various elements in the work of 
educational people, which would be useful to us in modifying the kind 
of measures we use and in apportioning relative values among them. 

In the matter of case reports, for instance, more and better sta- 
tistical information similar to that above given would enable us to tell 
whether the use of cases is as promising as it seems to be under sub- 
jective estimates. In the data obtained from the fourth class in the 
above study there may have been factors which interfered with the 
significant correlation which would normally exist. In any event, 
the figures obtained from the present study might not be borne out by 
other studies of a more intensive nature. 

It seems safe to assume, however, that the final examination by 
itself, even when its correlation with the final score is high, should not 
be the sole measure of student ability. 





A COMPARATIVE STUDY OF RECENT TEXTS 
IN PSYCHOLOGY, EDUCATIONAL PSYCHOLOGY, 
AND PRINCIPLES OF TEACHING 


HELEN FOSS WEEKS 
College of William and Mary 


H. D. PICKENS 
Superintendent of Schools, Oxford, Mississippi 


AND 


R. I. ROUDEBUSH 
Marshall College 


The study is an investigation of the nature and content of three 
subjects usually found in teacher-training courses; namely, psychology, 
educational psychology, and principles of secondary teaching. Pre- 
vious investigations indicate the need of more careful articulation of 
the subject-matter given in teacher-training courses. 

Several methods are being used to evaluate the different items 
that a teacher will need to know in order to be a successful worker. 
One of the most common ways is the questioning of the teachers in 
service as to what value certain items are in their own work. Another 
method of evaluation is to secure the opinions of experts in the field 
of education. A third method, the one used in this study, is the dis- 
covery of the emphasis given the different items by the writers of 
textbooks related to teacher training. 

In this study we have attempted to answer the following questions: 

1. How much overlapping of the major topics is there in psychology, 
educational psychology, and principles of teaching as indicated by 
the amount of space devoted to them in three textbooks in each of 
the three fields? 

2. How much space was devoted to actual school conditions as 
compared with the discussions and experiments not related to schools? 

3. To what extent has the emphasis on certain topics in educational 
psychology shifted during the last few years? 

A part of this study is a continuation of the one made by Goodwin
B. Watson and reported in this journal. He formulated a questionnaire
of fifteen major topics in educational psychology and determined
their importance by subjecting them to four tests. First, the questionnaire
was submitted to a group of experts in the training of educators
on the faculty of Teachers College. Second, it was rated by a
group of experienced teachers, supervisors and administrators who
were in certain graduate courses in educational psychology. Third, a 
group of undergraduates who had never taught but were preparing to 
teach rated the fifteen topics. Fourth, three educational psychology 
textbooks were analyzed to discover the amount of space given to 
each of the fifteen topics. The books Watson used for this purpose 
were the educational psychologies of Starch, Gates and Strong. 

The fifteen points given in Watson’s study and used as a basis 
for this are as follows: 


1. Problems of original nature, heredity and environment.
2. Problems of personality adjustment for teachers and pupils.
3. The outstanding interests of children at different ages.
4. Problems of general teaching method.
5. Problems of teaching methods for special subjects.
6. Problems involved in selecting curricula and texts.
7. Problems in the development of skills.
8. Problems of measurement.
9. Individual and group differences.
10. Problems involving extra-curricular activities.
11. Problems of inter-relationships and transfer of training.
12. Problems relating to the home as an educational institution.
13. Problems involved in dealing with adults.
14. Problems of the interaction of physical and psychological factors.
15. Psychological “schools” and theories.


Other studies of the nature of educational psychology and its overlapping
in the field of education have been made by Blue¹ and Bolton.²
Their findings have furnished additional evidence as to the need of 
defining the content of each of the subjects in the education curriculum. 
When the nature and limitations of each subject are fixed, then, and 
only then, will we be able to control the factor of overlapping. 


PROCEDURE


The writers surveyed the most recent texts in the fields of psy- 
chology, educational psychology and principles of secondary teaching, 
and with the assistance of the professors in each of these fields in the 
University of Michigan selected the following texts as outstanding: 


In psychology, Dashiell, “Fundamentals of Objective Psychology,” Hollingworth,
“Psychology—Its Facts and Principles,” Perrin and Klein, “Psychology—Its
Methods and Principles.”

In educational psychology, Cameron, “Educational Psychology,” Jordan,
“Educational Psychology,” Sandiford, “Educational Psychology.”

In principles of secondary teaching, Monroe, “Directed Learning in the High
School,” Morrison, “The Practice of Teaching in the Secondary School,” Stormzand,
“Progressive Methods of Teaching.”


The texts were read to find out how many pages were given to each 
of Watson’s fifteen topics. In order to provide for a check on the 
estimate of the number of pages given to a topic, each book was read 
by two members of the committee: those in the field of psychology by 
Pickens and Weeks, in educational psychology by Pickens and Roude- 
bush, in principles of teaching by Roudebush and Weeks. 

The material assigned to each topic was always classified as school 
or non-school. By school is meant materials developed in the school- 
room or closely related to schoolroom activities; such as application 
of tests to classroom studies, effect of health on school work, and pupil 
case study. By non-school is meant those materials, not connected 
with the regular schoolroom environment; such as, animal learning, 
college psychological laboratory experimentation, and adult learning. 

Since the number of words per page varied in the texts, for com- 
parison it was necessary to standardize the pages for all of the texts. 
This was done by counting the number of words on five average pages 
of each text and estimating the average number of words per page. 
A standardizing factor for each text was determined by taking as a 
standard the average page of the text (Cameron) which had the least 
number of words per page (269 words) and finding the ratio of the other 
average pages to it. The original counts were reduced to standard 
pages and expressed as per cents in Tables I and II. Materials for 
school and non-school are found in Tables IV and V. 
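
The standardizing step described above can be sketched as follows. Only the base of 269 words per page (Cameron) comes from the text; the five sample word counts and the raw page count are invented for illustration.

```python
BASE_WORDS_PER_PAGE = 269  # Cameron, the text with the fewest words per page

def average_words_per_page(sample_counts):
    """Estimate words per page from counts made on a few average pages."""
    return sum(sample_counts) / len(sample_counts)

def standardizing_factor(words_per_page, base=BASE_WORDS_PER_PAGE):
    """Ratio of a text's average page to the standard (base) page."""
    return words_per_page / base

def standard_pages(raw_pages, words_per_page, base=BASE_WORDS_PER_PAGE):
    """Convert a raw page count into standard pages."""
    return raw_pages * standardizing_factor(words_per_page, base)

# Hypothetical word counts on five average pages of one text.
sample = [400, 410, 395, 405, 390]
words_per_page = average_words_per_page(sample)

# Twenty raw pages on some topic in that text, expressed in standard pages.
print(round(standard_pages(20, words_per_page), 1))
```

Dividing a topic's standard pages by the total standard pages of the text, and multiplying by one hundred, then gives per cents of the kind entered in the tables.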

The per cent of space allotted each topic in the old educational 
psychologies (Starch, Gates, Strong) by Watson is given in Table 
VI, columns 1, 2 and 3, and the per cent for the new psychologies 
(Cameron, Jordan, Sandiford) by Weeks, Pickens and Roudebush in 
columns 4, 5 and 6; columns 7 and 8 are the means of the per cents of 
space used by the new and old educational psychologies for the topics; 
column 9 gives the difference between the new and old for each topic, 
with the old taken as the basis for comparison.


COMPARISON OF THE FIELDS TO DETERMINE OVERLAPPING 
DIFFERENCES IN THE FIELDS 


The striking thing about Fig. 1, in which the topics are arranged 
in descending order of emphasis in the educational psychologies, is 
the three high points. These are due to the fact that the principles
of teaching texts give, on the average, sixty-two per cent of their
space to General Methods of Teaching and that the psychologies give
fifty per cent to Psychological Schools and Theories and thirty per
cent to Original Nature, Heredity and Environment. Further study
reveals the fact that there are also very low points in these two subject
fields, and that the educational psychology texts show no such outstanding
differences in emphasis. The contrast in the amount of
concentration in the three fields is further shown by comparing the
per cent of space in each field which is given to its three major topics:
in psychology, eighty-nine per cent of the space is given to its three
major topics (Original Nature, Heredity and Environment; Interaction
of Physical and Psychological Factors; and Psychological Schools and
Theories); in principles of teaching, eighty per cent to its three major
topics (General Teaching Method, Teaching Methods for Special
Subjects, and Measurement); and in educational psychology, forty-five
per cent to its three major topics (Teaching Methods for Special Subjects;
Original Nature, Heredity and Environment; and Measurement).

FIG. 1.—Comparison of emphasis on the topics in three fields. From Table I.

It seems evident, then, that the content of the psychology and
principles of teaching texts is concentrated on a few topics, while the 
educational psychology is broader in its treatment. 


SIMILARITIES IN THE FIELDS 


TABLE I.—PER CENT OF EACH FIELD GIVEN TO EACH TOPIC

                                          Fields, per cent
Topic                            Psychology   Educational   Principles of
                                              psychology    teaching
Original nature.................     30           14            .3
Personality.....................      2            4             5
Interests.......................                   1            .3
General method..................     .3           10            62
Special subjects................      2           18            10
Curricula and texts.............                  .3             7
Skills..........................      1            1             4
Measurement.....................      4           13            10
Individual differences..........      1            7             3
Extra-curricular................
Transfer........................      1            8             2
Home............................                                .3
Adults..........................      3
Physical and psychological......      9           11
Schools and theories............     50           13

With these outstanding differences in mind, it is interesting to turn
to the question of the similarities. A considerable amount of overlapping
is present. Of the topics under consideration, thirteen of
which are discussed in at least one field, seven of them (over one-half)
are treated by all three fields. The minimum and maximum per
cent of space given to each of these topics are: Measurement, 
four to thirteen per cent; Teaching Special Subjects, two to eighteen 
per cent; Transfer of Training, one to eight per cent; Individual 
Differences, one to seven per cent; Personality Adjustment, two to 
five per cent; General Teaching Method, one-third to sixty-two per 
cent; Original Nature, Heredity and Environment, one-third to thirty 
per cent. It will be noted that for most of these topics the per cent 
of space given by one field is small, and that there is considerable 
emphasis by another.

There is more than fifty per cent overlapping in the three fields,
so far as the number of topics is concerned. The overlapping in con- 
tent cannot be determined directly from the tables, but the great 
difference in emphasis given by the three fields to the several topics 
suggests a lack of great duplication in content. 

A comparison topic by topic shows that the greatest overlapping 
is by educational psychology. It includes twelve topics, while each 
of the others includes only eleven. Moreover, it has two topics in 
common with principles of teaching which are not touched by psy- 
chology, and two in common with psychology which are not touched 
by principles of teaching. There are no topics occurring only in 
psychology and principles of teaching. 

These facts call for a further comparison of educational psychology 
with the other fields. Of the topics discussed in psychology, educa- 
tional psychology gives greater emphasis than does psychology to 
all but two—Psychological Schools and Theories, and Original Nature, 
Heredity and Environment; and of the topics discussed in principles 
of teaching, educational psychology gives greater emphasis than does 
principles of teaching to all but four—General Teaching Method, 
Personality Adjustment, Selecting Curricula and Texts, and Develop- 
ment of Skills. While it gives no space to two topics—The Home as 
Educational Institution, and Dealing with Adults—which receive 
minor treatment in at least one of the other fields, it surpasses both 
fields in the per cent of space given to the following six topics: Inter- 
action of Physical and Psychological Factors, Teaching Methods for 
Special Subjects, Transfer of Training, Individual and Group Differ- 
ences, Interests, and Measurement. 

These facts justify the statement that educational psychology 
does not merely duplicate the work of other fields, but also occupies
ground which receives little emphasis from the other two. 


DIFFERENCES BETWEEN TEXTS WITHIN EACH FIELD


The preceding statements are based on the means of the per cents 
of space given to each topic by the three books in each field. A 
comparison of individual texts within each field by a study of Table 
II makes apparent a striking difference of opinion between authors, 
not only as to the topics which should be discussed, but as to the 
emphasis which should be given them. The books vary greatly 
in length, as is seen by Fig. 2, and it is conceivable that the briefer books
could not discuss adequately as many topics as the longer ones.
But a moment’s study of Table II brings out the fact that the topics
omitted or given little space by the shorter books (Perrin and Klein,
Cameron, and Stormzand) are in several instances not the topics
which are given least emphasis by the other books in the same field;
and the page count (not printed here) shows that some topics are given
more pages in the short books than in the long ones. So there is not
only apparently but really a difference of opinion as to content and
emphasis.

FIG. 2.—Number of standard pages in each text.

1. Dashiell.            4. Cameron.      7. Monroe.
2. Hollingworth.        5. Jordan.       8. Morrison.
3. Perrin and Klein.    6. Sandiford.    9. Stormzand.

TABLE II.—THE PER CENT OF EACH BOOK GIVEN TO EACH OF THE FIFTEEN TOPICS

In educational psychology the differences between the texts are 
greater than the differences in either of the other fields. Jordan and 
Sandiford agree in omitting topics 7, 10, 12, 13, but have diametrically 
opposed ideas as to the emphasis which should be given to the remain- 
ing topics. Of these, the four to which Jordan gives most space are 
among the five to which Sandiford gives least; and the four highest ones
in Sandiford are among the six lowest in Jordan. This contrast is 
clearly shown in Table III. 


TABLE III.—CONTRAST OF EMPHASIS ON CERTAIN TOPICS IN THE TEXTS OF
JORDAN AND SANDIFORD

                                     Per cent of space
Topic                               Jordan    Sandiford
[topic illegible]..................   19          1
[topic illegible]..................    5         20
[topic illegible]..................    2         25
[topic illegible]..................   21         82











Jordan’s interest is in application, and Sandiford’s in theory. 
Cameron takes a middle ground between Sandiford and Jordan, 
surpassing both other texts on only one topic—Methods of Teaching 
Special Subjects—to which he gives 33 per cent of his book. 
In the field of principles of teaching there are some interesting
differences between texts, but none so outstanding as have been pointed
out in educational psychology. Except that Stormzand fails to give
any discussion of Personality Adjustment, or of Transfer, it will be
noted that his text and Monroe’s are in rather close agreement. Morrison
differs from both by giving 28 per cent of his space to a topic which
they omit—Methods of Teaching Special Subjects—and by omitting 
three topics which they discuss—Curricula and Texts, Development 
of Skills, and Individual Differences. 

The variations between the texts in psychology are less than in 
either of the other fields, though still considerable. 

Since the comparisons previously made of the fields of psychology,
educational psychology and principles of teaching were based upon
averages obtained from texts which differ so widely as this, it would
be natural to think that the conclusions drawn from these comparisons
are not really typical of the fields; that the conclusions from a wider
sampling of texts in these fields might be quite different; and, therefore,
that these conclusions are of slight significance. However, a little 
arithmetic will show that if three more books in each field were used, 
and if all of them took any one of the extreme positions found in any 
one of the texts studied, the change in the averages for the fields would 
not be great enough to invalidate any one of the conclusions stated. 
Consequently it is fair to say that the conclusions do represent char- 
acteristic differences and similarities among the fields. 
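
The "little arithmetic" referred to can be sketched as follows. The mean of sixty-two per cent for General Methods of Teaching in the principles of teaching texts is taken from the discussion of Fig. 1; the figure of forty per cent for the three added books is purely hypothetical, since the per cents for individual texts are not reproduced here.

```python
def mean_after_adding(current_mean, n_current, added_value, n_added):
    """Mean of a field after n_added further books all give the same per cent."""
    total = current_mean * n_current + added_value * n_added
    return total / (n_current + n_added)

# Three texts now average 62 per cent on General Methods of Teaching.
# Suppose three more texts each gave that topic only 40 per cent (hypothetical).
new_mean = mean_after_adding(62, 3, 40, 3)
print(new_mean)
```

With equal numbers of old and new books the new mean lies midway between the old mean and the value taken by the added books, so a conclusion is overturned only if that midpoint crosses the figure with which it is being compared.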


TABLE IV.—STANDARD PAGES AND PER CENT OF SCHOOL AND NON-SCHOOL
MATERIALS PER AUTHOR

[Table body not recoverable. Legible entries: Perrin and Klein, 99; Principles of teaching, Morrison, 665, 99 plus.]

COMPARISON OF SCHOOL AND NON-SCHOOL EMPHASIS


A comparison of the space given to school and non-school materials 
as presented in Tables IV and V, and Fig. 3, gives further evidence 
of the striking differences between the three fields. Principles of 
teaching is concerned only with the school: all three texts, in spite of 
some differences in topics selected, give one hundred per cent attention 
to the school. Psychology is very nearly as single-minded, but concentrates
on the other side, giving ninety-nine per cent of its space
to a non-school presentation of its topics. There is no such unanimity
of opinion among the writers on educational psychology. The fraction
of space given to topics from the school point of view varies from about
one seventh to about four sevenths: Sandiford, thirteen per cent;
Cameron, thirty-five per cent; and Jordan, fifty-seven per cent.

FIG. 3.—Per cent of space given to school and to non-school materials. From
Table IV. Shaded portions represent school material, and unshaded portions non-school
material.

1. Dashiell.            4. Cameron.      7. Monroe.
2. Hollingworth.        5. Jordan.       8. Morrison.
3. Perrin and Klein.    6. Sandiford.    9. Stormzand.

This contrast might have been anticipated from the inference 
drawn in a preceding paragraph that Jordan’s interest is in application 
and Sandiford’s in theory. It will be seen by a reference to Table V, 
that these interests have influenced not only the topics selected, but 
also the treatment of each topic; Jordan is looking toward the school 
but Sandiford sees it only out of the corner of his eye; his attention 
is elsewhere. A comparison of two topics illustrates this well: for 
Measurement, the ratio of school to non-school in Jordan is about 3 
to 1, in Sandiford 1 to 214; for Transfer the approximate ratio of school 
to non-school in Jordan is 70 to 1, in Sandiford, 1 to 244. 

TABLE V. STANDARD PAGES OF SCHOOL AND NON-SCHOOL MATERIALS PER TOPIC 
FOR EACH AUTHOR 

There is obviously a substantial agreement as to the appropriate 
content for the courses in psychology and in principles of teaching: 
in the former, a non-school treatment of all topics; in the latter, a 
school treatment of all topics. Educational psychology seems to be 
a sort of no-man’s-land: there is no such agreement as to approach, 
whether it shall be from the point of view of school or of non-school 
interests. 


COMPARISON OF THE OLD AND NEW TEXTS TO SHOW THE TREND IN 
EDUCATIONAL PSYCHOLOGY 


This difference of opinion among authors as to the emphasis which 
should be given in a text on educational psychology to the school, 
and the previously noted difference of opinion as to the emphasis 
which should be given the various topics point to a state of flux in the 
field which raises the question of the present trend. 

That there is a considerable difference of emphasis in the old and 
new texts in educational psychology stands out clearly in Fig. 4. 
The greatest disagreements are on topic 15, Psychological “Schools” 
and Theories, on which the newer texts exceed the older by eleven 
per cent of space; and topic 1, Original Nature, Heredity and 
Environment, on which the newer exceed the older by six per cent of space. 








FIG. 4. Comparison of space given to the topics by old and new educational 
psychologies. From Table VI, columns 7 and 8. Order of topics is based on descending 
emphasis in new texts. Solid line, new texts; broken line, old texts. 


On the other hand, the older exceed the newer by five per cent on 
Selecting Curricula and Texts, and on Development of Skills. These 
more striking differences indicate a greater emphasis on the part of the 
older psychologies on the practical and of the newer psychologies 
on the theoretical. 

This conclusion is borne out by a consideration of some other 
differences in the two groups of texts; to wit, three topics of practical 
application—Extra-curricular Activities, The Home as an Educational 
Institution, and Dealing with Adults—which are given brief treatment 
in the older texts are not treated at all in the newer. 

TABLE VI. COMPARISON OF SPACE IN THE NEW AND OLD EDUCATIONAL 
PSYCHOLOGIES ALLOTTED EACH TOPIC 

Several further points of difference might well be discussed: e.g., 
Methods of Teaching Special Subjects is treated at greater length 
by the newer texts, and Individual Differences by the older ones. 
But as such topics might lend themselves easily to either a practical 
or a theoretical treatment, and as Watson did not determine the 
division of material under the various topics on the basis of school and 
non-school, it is impossible to pursue this question to a conclusive 
answer. Such evidence as is clearly presented by the available facts 
seems to show a present trend in educational psychology toward the 
theoretical. 


SUMMARY 


I. Overlapping. 

1. There is overlapping of all fields on the selection of topics to 
the extent of more than fifty per cent. 

2. There is great variation in the per cent of space given to the 
common topics. 

3. Educational psychology overlaps the other fields more than they 
overlap each other. 

4. Educational psychology lacks the extreme specialization of 
the other fields. 

II. Comparison of School and Non-school Materials. 

1. The psychologies devote ninety-nine per cent of their space to 
non-school material; the educational psychologies vary from 
forty-three per cent to eighty-seven per cent given to non-school material; 
and the principles of teaching give no non-school materials: they are 
one hundred per cent school. 

III. Comparison of Old and New Psychologies. 

1. The new psychologies give a larger per cent of space to the 
theoretical topics than do the older ones, and so far as can be 
determined a less practical emphasis to other topics. 


THE CHANGE IN INTELLIGENCE QUOTIENTS IN 
BEHAVIOR PROBLEM CHILDREN¹ 


ANDREW W. BROWN 


Institute for Juvenile Research, Chicago 


I. Statement of Problem.—The question of the amount of change in 
the intelligence quotient from one examination to another is one that 
has received a great deal of attention, and the results of a number of 
studies have been reported. For the most part, these studies have 
been made on normal children. Few studies have been made of the 
reliability of these intelligence ratings for children presenting behavior 
problems. Truancy from home, truancy from school, lying, stealing, 
temper tantrums, and other behavior deviations are frequently 
symptoms of conflict either in the child’s own mind or with others in 
the home environment, and the question often arises as to the result 
of this conflict on his rating on the intelligence tests. This study is 
made in an attempt to throw some light on this problem. The purpose 
of the study is three-fold: (1) To determine the amount of variation 
in behavior problem children; (2) to compare this with the fluctuation 
in normal children; (3) to enumerate some of the conditions of large 
variation. 

A very good review of previous studies of this problem has been 
made by T. G. Foran and presented in a recent issue of the Educational 
Research Bulletin of the Catholic University of America. In general, 
it has been found that in normal children the average amount of 
change from the first to the second examination is about five points 
when all causes of variation are uncontrolled. Only about twenty 
per cent of the deviations exceed ten points. The correlations 
between successive tests at varying intervals of time range from .80 
to .95. 

II. Conditions of This Study.—The present study deals with the 
cases that have been re-examined at the Institute for Juvenile Research 
between 1920 and 1928. There were 707 children who had been 
given two or more Stanford-Binet ratings. This does not include 
those whose rating on the first examination, because of lack of 
cooperation or some other reason, was questioned by the examiner. The 
tests were given by various people, all of whom were experienced 
examiners. In comparatively few cases, however, were the first and 
subsequent tests given by the same person. 

¹ Read at the meeting of the Orthopsychiatric Association, February 22 and 23, 
1929, New York City. This is one of a series of studies from the Institute for 
Juvenile Research, Herman M. Adler, M.D., Director, Series C No. 155. The 
writer wishes to express his gratitude to Miss Dorothy Coors for the tabulation 
of the data from Institute records. 

The reasons for referring these cases to the clinic were tabulated 
for each of the 707 cases and as far as could be observed, these do not 
differ from those for which the other children were referred. 

The chronological ages ranged from two to eighteen years. The 
average was 10.5 years. About sixty-seven per cent of the cases were 
between seven and fourteen years of age. 

Table I gives the distribution of IQ’s on the first examination. 


TABLE I. THE DISTRIBUTION OF IQ’S ON THE FIRST EXAMINATION (N = 707) 

RANGE      FREQUENCY      RANGE      FREQUENCY 
125-129         2         70-74          92 
120-124         3         65-69          71 
115-119         3         60-64          48 
110-114         4         55-59          32 
105-109        18         50-54          20 
100-104        30         45-49           3 
 95-99         42         40-44           8 
 90-94         62         35-39           3 
 85-89         74         30-34           5 
 80-84         87         25-29           1 
 75-79         97         20-24           0 
                          15-19           1 


III. Relationship between the Various Examinations, Regardless 
of the Time Factor—The IQ’s on the first examination range from 
twenty to one hundred thirty with a mean at 78.74 and a standard 
deviation of 15.7. The means and standard deviations are shown 
in Table II. The mean of the second examination is 79.1 and the 
standard deviation 16.7. The time between the first and second 
examination ranges from a few weeks to over four years with an average 
time of about fifteen months. The correlation between the first and 
second examination for the 707 cases, disregarding the time between 
them, is .88 ± .006. 

The correlation between the second and third for one hundred 
forty-nine cases, regardless of the time between them, is .87 ± .013. 
This correlation approximates very closely that between the first and 
second. The average time between the second and third is about 
fourteen months, which is about the same as that between the first and 
second. 

The correlation between the first and third, regardless of the time 
between them, is for one hundred forty-nine cases .70 ± .028. 
The difference of eighteen points between this and the correlations 
between the first and second and second and third is difficult to explain. 
One might suspect that it might be due to the fact that the time 
interval between the first and third is the sum of the time intervals 
between the first and second and second and third and that the longer 
the time interval the greater the amount of change. But from results 
shown in Table III, the time between the tests does not seem to be 
an important factor in producing change in the ratings. 
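
The figures that follow each correlation (.006, .013, .028) are probable errors. The text does not say how they were computed; the short Python sketch below assumes the classical formula PE = 0.6745(1 - r²)/√N, which happens to reproduce the three reported values. The function name is illustrative only.

```python
import math


def probable_error_r(r: float, n: int) -> float:
    """Probable error of a correlation, assuming PE = 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Correlations and numbers of cases as reported in the text.
reported = [
    ("first and second", 0.88, 707),
    ("second and third", 0.87, 149),
    ("first and third", 0.70, 149),
]

for label, r, n in reported:
    print(f"{label}: r = {r:.2f}, probable error = {probable_error_r(r, n):.3f}")
```

Under this assumption the much larger probable error of the first-third correlation follows from the lower r, since the number of cases is the same as for the second-third comparison.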

This correspondence between the first and second tests for behavior 
problem children, with the exception of the .70, agrees closely with that 
found by others on the normal child. Terman reports a correlation 
of .93, H. O. Rugg and Cecile Colloton a correlation of .84, Garrison a 
correlation of .88 and L. S. Rugg a correlation of .95. 


TABLE II. RELATIONS BETWEEN DIFFERENT EXAMINATIONS REGARDLESS OF TIME 

Symbol    Between first    Between first    Between second 
          and second       and third        and third 
N         706              149              149 
r         0.88 ± .006      0.70 ± .028      0.87 ± .013 
M1        78.74            81.59 
M2        79.1                              82.33 
M3                         82.16            82.16 
σ1        15.7             13.13 
σ2        16.7                              14.39 
σ3                         14.98            14.98 

(The entries 79.1 and 16.7 are illegible in the table as reproduced and are supplied from the text of Section III.) 














IV. Amount of Change at Different Time Intervals.—The influence 
of different intervals of time between the first and second test is shown 
in Table III. For two hundred twenty-one cases, with a time 
interval of one year or less between the examinations, the correlation 
is .91 ± .007. With a time interval of two years and less than three 
between the tests, the correlation is .87 ± .009, for three hundred 
twenty cases; with three years and less than four for ninety-nine cases 
the correlation is .88 ± .015, and with four years and less than five 
the correlation is .87 ± .02 for forty-one cases. These correlations 
seem to indicate that the time factor alone, at least up to a four-year 
period, has little influence on the amount of variation. This again is in 
fair agreement with the results of other studies. Garrison gives a 
correlation of .88 at a one-year interval, a correlation of .91 at a 
two-year interval and a correlation of .83 at a four-year interval. 


TABLE III. RELATION BETWEEN FIRST AND SECOND EXAMINATION WITH DIFFERENT 
TIME INTERVALS BETWEEN THEM 

Symbol    One year       One year and     Two years and     Three years and 
          or less        less than two    less than three   less than four 
N         221            320              99                41 
r12       0.91 ± .007    0.87 ± .009      0.88 ± .015       0.87 ± .02 
M1        77.9           79.11            78.9              81.03 
M2        79.6           80.23            76.8              76.64 
σ1        15.12          15.39            16.17             17.08 
σ2        16.3           16.43            17.27             17.2 

(The probable errors are truncated in the table as reproduced and are supplied from the text of Section IV.) 





V. The Amount of Change at Different Levels of Intelligence.— 
The average of the amount of change between the first and second 
examination for children at different levels of intelligence has been 
found to be about the same for normal children. For one hundred 
eighty-three children of 110 IQ and above, Terman found an average 
difference of 5.8 points; Garrison found a difference of 5.6 points 
for those of 110 IQ and above for twenty-six cases; and Rugg and 
Colloton a difference of 4.6 for ninety-seven cases with IQ of 110 and 
above. For children of 90 to 109 IQ the average change reported in 
these three studies is 6.2, 4.0 and 4.7 respectively. Terman reports 
a difference of 5.8 for one hundred four children below 90 IQ. These 
would seem to indicate that for normal children there is no signif- 
icant difference in the amount of variation at different levels of 
intelligence. 

With children who present behavior problems, the results seem to 
be somewhat different. The correlations seem to warrant the general 
conclusion that the brighter children are likely to be more variable. 
The data are presented in Table IV. For eighty-three cases of 60 IQ 
and below, the correlation between the first and second test is .81 ± 
.026. The mean IQ for this group in the first and second examination 
is 52.02 and 53.7 respectively. The sigmas are 9.3 and 12.41 
respectively. For four hundred seventy-five cases from 61-90 IQ on the 
first test the correlation is .68 ± .017. The means are 76.8 and 
77.3 respectively; the sigmas are 8.06 and 10.75. For one hundred 
forty-eight cases with IQ of 91 and above, the correlation between 
the first and second is .61 ± .03. The means are 99.9 and 98.9 and 
the sigmas are 7.6 and 10.42. These different correlations have not 
been corrected for difference in variability, and it may be that this 
difference would explain the difference in correlation. 

These correlations indicate that the ratings on the feebleminded 
cases are more reliable than the ratings on those with the higher IQ’s 
and that with behavior problem children the amount of change from 
one examination to another increases with increase in the intelligence 
rating. Dr. Poull in a study of defectives found that the average 
amount of change was less than that for children of average intelligence, 
and she concludes that “defectives are not more variable than normal 
subjects.” This study of behavior problem children would seem to 
indicate that they are less variable. 


TABLE IV. RELATION BETWEEN FIRST AND SECOND EXAMINATION AT DIFFERENT 
LEVELS OF INTELLIGENCE 

Symbol    Below 60 IQ    60-90 IQ       90 IQ and above 
N         83             475            148 
r12       0.81 ± .026    0.68 ± .017    0.61 ± .03 
M1        52.02          76.81          99.9 
M2        53.7           77.33          98.9 
σ1        9.3            8.06           7.6 
σ2        12.41          10.75          10.42 














VI. Changes Due to Difference in Sex.—There is no significant 
difference in the amount of change from one test to another due to sex 
alone. The amount of change for the boys is about the same as that 
for girls. For four hundred fifty-eight boys who were re-examined the 
correlation between the first and second test is .88 ± .007; and for two 
hundred forty-eight girls the correlation is .87 ± .01. The means 
and the sigmas are also about the same. The data are presented in 
Table V. 


TABLE V. RELATIONSHIP BETWEEN DIFFERENT EXAMINATIONS FOR BOYS AND GIRLS 

Symbol    Boys           Girls 
N         458            248 
r12       0.88 ± .007    0.87 ± .01 
M1        79.48          77.36 
M2        80.068         77.32 
σ1        15.395         16.15 
σ2        16.81          16.37 

VII. Changes at Different Age Levels—Table VI gives the average 
of the positive and negative change and the average of the total change 
at the different year levels. It will be observed that the greatest 
variation from one examination to another takes place at the extremes 
of the chronological age scale. For the ages from three to five the 
average change is 9.7 points (see Table VII) and for the ages fifteen to 
eighteen the average change is 6.02 points, while for the years from 
six to fifteen the average amount of change is about five points. This 
tendency for the extremes of the CA group to show the greatest 
amount of change cannot be due to the fact that the scale is too easy 
at the lower ranges and too difficult at the upper ranges, for if this 
were so, other things being equal, the negative change at these levels 
would be greater than the positive change and the reverse is true. 
It is, however, the changes at the extremes which have the greatest 
influence in lowering the correlation. 


TABLE VI. TABLE SHOWING VARIATIONS AT EACH YEAR LEVEL 

Age    Number of    Average     Average     Average 
       cases        positive    negative    total 
 2        10          4.3         1.9         6.2 
 3        19          7.9         3.9        11.8 
 4        23          8.6         1.6        10.1 
 5        42          4.5         2.7         7.2 
 6        48          4.1         2.6         6.8 
 7        56          3.0         3.3         6.3 
 8        75          2.0         2.8         4.8 
 9        67          2.2         3.1         5.4 
10        58          1.7         3.4         5.1 
11        66          2.2         2.6         4.8 
12        74          2.6         2.3         4.9 
13        69          2.3         2.8         5.1 
14        42          2.8         2.5         5.3 
15        26          3.7         1.9         5.6 
16        17          4.6         0.8         5.4 
17         8          6.1         0           6.1 
18         2          7.0         0           7.0 

Total number of cases: 702. Average total for all age levels: 5.79. 


TABLE VII. TABLE SHOWING VARIATIONS AT DIFFERENT YEAR LEVELS 

Age      Number of    Average     Average     Average 
         cases        positive    negative    total 
 3-5        84          6.36        2.69        9.05 
 6-8       179          2.90        2.88        5.78 
 9-11      191          2.06        3.01        5.07 
12-14      185          2.55        2.54        5.08 
15-18       53          4.45        1.19        5.64 

(The age labels of Table VII are illegible as reproduced; they are inferred here from the numbers of cases in Table VI.) 
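
The text quotes 9.7 and 6.02 points for the youngest and oldest groups, whereas Table VII lists 9.05 and 5.64 for what appear to be the same groups. One possible reading, offered as an assumption and not stated by the author, is that the text figures are simple means of the yearly averages in Table VI while the Table VII figures are means weighted by the number of cases. The Python sketch below works this out from the Table VI entries; the variable and function names are illustrative.

```python
# age: (number of cases, average total change), copied from Table VI
TABLE_VI = {
    3: (19, 11.8), 4: (23, 10.1), 5: (42, 7.2),
    15: (26, 5.6), 16: (17, 5.4), 17: (8, 6.1), 18: (2, 7.0),
}


def unweighted_mean(ages):
    """Simple mean of the yearly averages, ignoring the number of cases."""
    return sum(TABLE_VI[age][1] for age in ages) / len(ages)


def weighted_mean(ages):
    """Mean of the yearly averages weighted by the number of cases at each age."""
    cases = sum(TABLE_VI[age][0] for age in ages)
    return sum(TABLE_VI[age][0] * TABLE_VI[age][1] for age in ages) / cases


for label, ages in (("3 to 5", [3, 4, 5]), ("15 to 18", [15, 16, 17, 18])):
    print(f"ages {label}: unweighted {unweighted_mean(ages):.2f}, "
          f"weighted {weighted_mean(ages):.2f}")
```

On this reading the unweighted means come to about 9.7 and 6.0, and the weighted means to about 9.0 and 5.7, close to the Table VII entries; the small remaining differences would be expected from rounding in the yearly averages.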

These results with behavior problem children agree very closely 
with those reported in other studies. Terman reports an average 
variation of 6.9 points for ages three to six; a variation of 6.0 points 
for ages six to nine; a variation of 5.3 for ages nine to twelve; and a 
variation of 6.3 for ages twelve and above. Other studies have been 
reported but the number of cases at some of the age levels is so few 
that a comparison of one age group with another is unreliable. 


TABLE VIII. SHOWING THE DISTRIBUTION OF THE AMOUNT OF CHANGE 

Amount of change    Number of cases    Per cent 
 0-5                     416             58.9 
 6-10                    182             25.8 
11-15                     73             10.3 
16-20                     16              2.2 
21-25                     12             [illegible] 
26-30                      7               .9 
30 and above               1             [illegible] 

Total                    707 











VIII. Distribution of the Amount of Change.—The distribution of 
the amount of change is shown graphically in Fig. 1. Four hundred 
sixteen of the 707 cases or 58.9 per cent change five points or less; 
598 cases or 84.7 per cent change ten points or less; 671 cases or 
95 per cent change fifteen points or less; 687 cases or 97.2 per cent 
change less than twenty-one points and nineteen or 2.7 per cent change 
twenty-one points or more. Or to state this in another way, the 
chances are about forty in one hundred that the change from the first 
examination to the second will be greater than five points; the chances 
are about fifteen in one hundred that the change will be greater than 
ten points; and about five in one hundred that it will be greater than 
fifteen points; and about two in one hundred that it will be greater 
than twenty points. 
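
The "chances in one hundred" quoted above are cumulative percentages read off the distribution in Table VIII. A minimal Python sketch of that arithmetic follows, using the counts as printed in Table VIII; the names are illustrative, and because the printed counts and the percentages in the text do not agree to the last case, the results match the text only approximately.

```python
# (upper limit of the interval in IQ points, number of cases), from Table VIII.
# None marks the open-ended "30 and above" interval.
BINS = [(5, 416), (10, 182), (15, 73), (20, 16), (25, 12), (30, 7), (None, 1)]


def percent_exceeding(bins):
    """Per cent of cases whose change exceeds each interval's upper limit."""
    total = sum(count for _, count in bins)
    running = 0
    result = {}
    for upper, count in bins:
        running += count
        if upper is not None:
            result[upper] = 100.0 * (total - running) / total
    return result


for limit, pct in percent_exceeding(BINS).items():
    print(f"change greater than {limit} points: about {pct:.1f} in one hundred")
```

With these counts the chances of a change greater than five, ten and fifteen points come to roughly 41, 15 and 5 in one hundred, in line with the figures in the text.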

Although the correlation between the rating on one test and another 
on the Stanford-Binet is high, a large number of cases make 
considerable change, and from the clinical point of view these are often the 
important cases. One hundred eight cases or 15.2 per cent change 
eleven points or more. To say that the average change is about 
five points does not help a great deal, because in dealing with clinical 
cases one can never be sure that the particular case under observation 
may not be one that will show a large amount of change. It would 
seem advisable therefore to secure at least two ratings wherever an 
intelligence rating is especially important in disposing of the case or 
in making recommendation. 

FIGURE I. SHOWING THE DISTRIBUTION OF THE AMOUNT OF CHANGE 
FROM FIRST TO SECOND EXAMINATION 

IX. Some Possible Causes for Variations in IQ.—The records of 
all the cases with a change of more than twelve points were studied, 
in order to determine if possible some of the important conditions of 
change in ratings. Those which showed increase and those which 
showed decrease were studied separately. In listing these conditions 
it is not assumed that they are causes of change in all children. It 
might well be that these same conditions may prevail in other cases 
where there is no change in rating. A study of a large number of 
cases presenting the behavior problem in question would need to be 
made in order to determine the extent to which that difficulty was a 
cause of change. These are only mentioned as possible causes. 


(a) Six of the thirty-six cases with a decrease of twelve points or more had 
mention of encephalitis either in the physical examination or in the social history. 
The case with the largest amount of decrease with a drop from ninety-eight to 
seventy-five was such a case. 

Three of the thirty-six cases had mention of convulsive attacks. Three of the 
thirty-six had mention of lues or antiluetic treatment. One had received a head 
injury, being hit by a truck, at which time he was unconscious for several hours, and 
in one other the psychiatric report stated that it was unusual to find a “psychosis 
in a child so young.” It is significant that all of these conditions are those which 
affect the central nervous system. 

(b) Another possible cause of change is change in the social environment. 
This factor may be correlated with either a decrease or an increase. If the child 
remains in a poor social environment or is changed to a poor social environment, 
the rating may decrease, especially if the first test is given at an early age. Baldwin 
in a recent study reported that there was a greater difference between rural and 
urban children at the later ages than at the earlier ages. This is probably due to 
the fact that the older rural children have not had an opportunity to learn those 
items which the test demanded. The same is true of those cases who remain in a 
poor social environment in an urban population. 

(c) A third possible cause of change is the difference in the reliability of the 
scale at different year levels. Thurstone has shown that the items of the test 
are not properly scaled: that, for example, some of the tests of year seven are 
easier than those at year six and so on, and that the degree of difficulty between 
the different subtests at different year levels is not the same. This would operate 
to produce larger changes at some year levels than others. 

(d) A number of large differences seem to be due to failure of the child to 
cooperate on one or other of the examinations, where the examiner failed to 
question the rating. The change of thirty-eight points given in Table VII was such 
a case. (This case with the change of thirty-eight points was one of those included 
in the correlation of .70 given in Table II.) 

(e) It also seems possible, although it is difficult to get objective evidence, that 
some large changes are due to difference in the personality of the examiner, not 
difference in method of giving or scoring, but due to the fact that the child responds 
more freely to some individuals than to others. 

(f) In the cases at the Institute a number of increases seemed to be due to 
increase in facility with the English language. 

(g) There were several cases where improvement in behavior as reported in the 
social history was accompanied with marked increase in intelligence ratings. 


Conclusions. 

1. The amount of change from one examination to another, when 
the change of a large number of cases is averaged, is small (about 5.8 
points). 

2. There are, however, large variations in individual cases which 
indicate the necessity of giving more than one rating if an accurate 
measure of the child is to be secured. 

3. The fluctuations of the ratings of behavior problem children 
are little greater on the average than those of the so-called normal 
child. 

4. There is no significant difference on the average between the 
fluctuations of boys and girls. 

5. The length of time, at least up to a four-year interval, does not 
significantly affect the amount of change in the ratings. 

6. There is less change in the ratings of the feebleminded than there 
is in the rating of the child of average intelligence who presents 
behavior problems. 

7. The case records of those making larger changes than twelve 
points were studied and the significant factors that accompany the 
change have been listed. 


THE RELIABILITY OF THE ACHIEVEMENT 
QUOTIENT 


CLYDE A. MORLEY 


University of Wisconsin 


The achievement quotient is such a simple and plausible device 
for measuring pupil accomplishment or pupil effort that it has been 
widely used. Serious doubts have arisen as to its worth on the basis 
of validity and reliability. The arguments concerning validity are 
mostly theoretical, and the criticisms against the reliability are 
supported by meager experimental evidence. If the achievement 
quotient is too unreliable for practical school purposes it should be 
discarded. If it is reliable under certain circumstances then such 
conditions should be recognized and controlled. 

This investigation was attempted with the hope that the conditions 
or factors affecting the reliability of the achievement quotient could 
be discovered and controlled. The problem resolved itself into: 
(1) The determination of the degree of reliability required of 
educational tests and intelligence tests, to secure achievement quotients 
sufficiently reliable for practical school purposes. (2) The 
identification of other factors affecting the reliability of the achievement 
quotient. (3) The recognition of conditions to be met before the 
achievement quotient can be used satisfactorily. 


DEFINITION OF RELIABILITY 


Reliability in this study, unless otherwise stated, will refer to the 
correlation between scores made on two comparable forms of a test. 
All correlation coefficients were obtained by the formula: 

$$ r = \frac{M_{ab} - (M_a \times M_b)}{\sqrt{M_{a^2} - (M_a)^2}\,\sqrt{M_{b^2} - (M_b)^2}} $$

where 
r = the correlation coefficient. 
M = the mean of the column indicated by the subscript. 
a = scores on form A of the test. 
b = scores on form B of the test. 
a² = the squares of the scores on form A of the test. 
b² = the squares of the scores on form B of the test. 
ab = the products of the scores on form A and the scores on form B. 
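
The formula above can be followed step by step in a short Python sketch. The function and variable names are illustrative and not part of the original study, and the sample scores in the usage note are invented for the demonstration.

```python
from math import sqrt
from typing import Sequence


def column_mean(values: Sequence[float]) -> float:
    """Mean of one column of the work sheet."""
    return sum(values) / len(values)


def reliability(a: Sequence[float], b: Sequence[float]) -> float:
    """Correlation between form A and form B scores from the column means.

    Follows r = (M_ab - M_a * M_b) / (sqrt(M_a2 - M_a^2) * sqrt(M_b2 - M_b^2)).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long score lists with at least two pupils")
    m_a = column_mean(a)
    m_b = column_mean(b)
    m_ab = column_mean([x * y for x, y in zip(a, b)])
    m_a2 = column_mean([x * x for x in a])
    m_b2 = column_mean([y * y for y in b])
    # A column in which every score is the same gives a zero denominator.
    return (m_ab - m_a * m_b) / (sqrt(m_a2 - m_a ** 2) * sqrt(m_b2 - m_b ** 2))


# Hypothetical scores of five pupils on the two forms.
form_a = [42, 55, 61, 48, 70]
form_b = [45, 52, 66, 47, 68]
print(f"r = {reliability(form_a, form_b):.2f}")
```

This form needs only the means of five columns (a, b, a², b², ab), which is presumably why it suited hand computation.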


ADMINISTRATION OF TESTS 


Forms A and B of the Stanford Achievement Reading, Arithmetic, 
and Spelling tests, and forms A and B of the Otis Self-administering 
Tests of Mental Ability were administered to all pupils of Grade VIA 
in the Racine Public Schools. Form A of each test was given on May 
8, 1928. Form B of each test followed one week later. 


DERIVED SCORES 


Two mental ages and six pairs of subject ages were computed 
for each pupil on the basis of the two forms of each type of test used. 
From these measures two sets of achievement quotients were obtained 
for each pupil in reading, arithmetic, and spelling. The first set of 
achievement quotients was found by dividing the subject age, obtained 
on form A of the achievement test, by the mental age obtained on 
form A of the intelligence test. The second set of achievement 
quotients was computed in a similar manner using the ages found on 
form B of the respective tests. 


TABLE I 

Reliability Coefficients, Obtained by Correlating the Scores Made by 381 
Pupils of Grade VIA in the Public Schools of Racine, Wisconsin on Two Forms 
of the Respective Tests, for the Otis Self-administering Tests of Mental Ability; 
the Stanford Achievement Tests in Reading, Arithmetic, and Spelling; and the 
Obtained Achievement Quotients, Together with the Correlations between 
Mental Ages and Subject Ages Computed on the Same Tests 

Reliability of intelligence test¹: .85 ± .01 (column 4); .92 ± .005 (column 5) 

Stanford achievement tests       Test           Mental age and   Achievement    Achievement 
                                 reliability    subject age      quotient       quotient 
            1                        2               3               4              5 
[Spelling]                       .93 ± .005     .49 ± .05        .74 ± .016     .96 ± .003 
[Reading: Total]                 .87 ± .008     .74 ± .03        .54 ± .025     .68 ± .02 
Reading: Word meaning            .83 ± .012     .66 ± .036       .55 ± .036     .71 ± .018 
Arithmetic: Total                .80 ± .013     .56 ± .045       .63 ± .021     .73 ± .016 
Arithmetic: Computation          .75 ± .015     .45 ± .05        .68 ± .021     .71 ± .018 
Reading: Paragraph meaning       .75 ± .015     .70 ± .032       .37 ± .03      .49 ± .026 
Reading: Sentence meaning        .73 ± .016     .67 ± .035       .46 ± .028     .52 ± .026 
Arithmetic: Reasoning            .70 ± .018     .58 ± .04        .50 ± .027     .59 ± .022 

(Each column contains the correlation coefficient with its probable error. 
Column 1 indicates the subject tests; column 2, the obtained reliabilities of the 
various tests used; column 3, the correlations between mental ages and subject 
ages; column 4, the reliability coefficients for achievement quotients obtained with 
scores on the mental test used separately; column 5, the reliability coefficients 
of achievement quotients obtained by using the average score of the mental tests. 
The two row labels in brackets are illegible as reproduced and are inferred from the list of tests administered.) 

¹ .85 is the correlation between comparable forms of the mental test, and .92 is 
the coefficient obtained by use of the Spearman-Brown formula for a test twice as 
long as the one used. 
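
The first two sets of achievement quotients, as the text describes them, can be sketched in a few lines of Python. The function names and the sample ages (given in months) are hypothetical; the quotient is left as a plain ratio because the text speaks only of dividing the subject age by the mental age.

```python
def achievement_quotient(subject_age: float, mental_age: float) -> float:
    """Subject age divided by mental age, both in the same unit (months here)."""
    if mental_age <= 0:
        raise ValueError("mental age must be positive")
    return subject_age / mental_age


def quotients_by_form(subject_a: float, subject_b: float,
                      mental_a: float, mental_b: float) -> tuple[float, float]:
    """First set uses the form A ages; second set uses the form B ages."""
    return (achievement_quotient(subject_a, mental_a),
            achievement_quotient(subject_b, mental_b))


# Hypothetical pupil: reading ages 132 and 126 months, mental ages 120 and 124 months.
aq_form_a, aq_form_b = quotients_by_form(132, 126, 120, 124)
print(f"form A quotient = {aq_form_a:.2f}, form B quotient = {aq_form_b:.2f}")
```

Correlating the form A quotients with the form B quotients over all pupils then gives the reliability of the achievement quotient in the sense defined earlier.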

Two other sets of achievement quotients were calculated for each 
pupil using the average mental age obtained on the two forms of the 
mental test. This made the reliability of the mental ages comparable 
to those obtained on a similar test twice as long as the one used. This 
procedure was followed to show the effect of increased reliability of 
the mental test upon the reliability of the achievement quotient. The 
desired reliability coefficient was found by using the Spearman-Brown 
formula. 
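
The step from .85 to .92 can be checked with the Spearman-Brown formula in its usual form, r_k = k r / (1 + (k - 1) r), where k is the factor by which the test is lengthened. The Python sketch below is illustrative only; the function name is not from the original study.

```python
def spearman_brown(r: float, k: float = 2.0) -> float:
    """Predicted reliability of a test lengthened k times, given reliability r."""
    return k * r / (1.0 + (k - 1.0) * r)


# Reliability of one form of the mental test, as reported: .85.
doubled = spearman_brown(0.85)
print(f"reliability of a test twice as long: {doubled:.2f}")
```

Doubling a test of reliability .85 gives about .92, the figure used for the averaged mental ages.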


INFLUENCE OF MENTAL TEST RELIABILITY 


A comparison of the coefficients of correlation for achievement 
quotients in the various subjects (columns 4 and 5 of Table I) indicates 
clearly that a definite relationship exists between the reliability of 
the achievement quotient and the reliability of the intelligence test 
used. Without exception the achievement quotient reliability 
coefficients are larger (column 5) where the mental test reliability is 
.92, than when such reliability is only .85 (column 4). This indicates 
that the achievement quotient becomes more reliable as the reliability 
of the intelligence test increases. 


INFLUENCE OF ACHIEVEMENT TEST RELIABILITY


The effect of achievement test reliability upon the reliability of 
the achievement quotient is not quite so obvious. An examination 
of the data in columns 2 and 4 of Table I indicates a trend toward a 
directly proportional relationship, but exceptions occur. One finds,
for example, that “Reading: Word meaning” is more reliable (.83)
than “Arithmetic: Computation” (.75), but that the achievement
quotient calculated from the “Arithmetic: Computation” test scores
possesses a higher degree of reliability (.63) than does the achievement
quotient derived from “Reading: Word meaning” test scores (.55).
This relationship is due, at least in part, to the influence of the degree
of correlation between subject test scores and intelligence test scores.
A portion of Table I is here reproduced to show that when the correlation
between subject ages and mental ages remains constant (column 3)
the reliability of the achievement quotient varies in the same direction
as the reliability of the test.


TABLE IA.—RELIABILITY COEFFICIENTS

Stanford achievement tests            r_ab     r_ms     (AQ) r_ab
             1                          2        3          4
Reading: Word meaning ............     .83      .66        .55
Reading: Sentence meaning ........     .73      .67        .46





The tests “Reading: Word meaning” and “Reading: Sentence
meaning” have reliability coefficients of .83 and .73 respectively
(column 2). The factor of correlation between subject ages and mental
ages remains practically constant (column 3), and the reliability
coefficients for the derived achievement quotients are .55 and .46
respectively, indicating that, other things being equal, the reliability
of the educational test does influence the reliability of the resulting
achievement quotient in the same direction.


INFLUENCE OF INTERCORRELATION OF MENTAL AND SUBJECT TESTS


To illustrate the influence of the size of the correlation coefficient
between subject ages and mental ages, another portion of Table I
has been reproduced to facilitate explanation.


TABLE IB.—RELIABILITY COEFFICIENTS

Stanford achievement tests            r_ab     r_ms     (AQ) r_ab
             1                          2        3          4
Arithmetic: Computation ..........     .75      .45        .63
Reading: Paragraph meaning .......     .75      .70        .37


The obtained reliability coefficients of the tests “Arithmetic: Computation”
and “Reading: Paragraph meaning” are identical (column 2), and the
same intelligence test was used (.85). Assuming that the reliability
of the achievement quotient is affected only by the respective reliabilities
of the educational test and the mental test, one should expect
identical reliability coefficients for the AQ’s secured by means of the
above-mentioned tests. Such is not the case, for the “Arithmetic”
achievement quotient has a reliability of .63, while that of “Reading”
is only .37. Apparently the only other factor involved is the intercorrelation
of subject ages and mental ages. The intercorrelations in
these instances are .45 for “Arithmetic” and .70 for “Reading.”
This indicates that as the correlation between achievement scores and 
intelligence scores increases, or as educational age approaches mental 
age, the reliability of the achievement quotient decreases. It follows 
that, as the tests approach each other as to similarity of material, the 
achievement quotient becomes a less reliable index. It is quite 
generally agreed that scores on intelligence tests are more dependent 
upon reading ability than on proficiency in any other school subject. 
This idea was substantiated in this study by the fact that correlations 
between reading test scores and intelligence test scores are consistently 
higher than those obtained between any other subject test scores and 
mental test scores. It will be noted further that the achievement
quotient reliability coefficients for reading are lower than for the other
subjects. Obviously as the correlation between educational tests
and mental tests increases, the reliability of the achievement quotient
decreases, and vice versa.


APPLICATION OF HOLZINGER’S FORMULA


A modification of Holzinger’s formula for determining the reliability
of ratios may be used in calculating the reliability of achievement
quotients if the ratios of the standard deviations to the means are
approximately equal, or when suitable choice of origins is made. The
formula becomes:


\[
R_{\frac{s'}{m'}\,\frac{s''}{m''}} =
\frac{R_{s's''} + R_{m'm''} - R_{s'm''} - R_{s''m'}}
     {2\sqrt{(1 - R_{s'm'})(1 - R_{s''m''})}}
\]

where

R   = correlation coefficients.
s'  = subject ages obtained on form A of achievement tests.
s'' = subject ages obtained on form B of achievement tests.
m'  = mental ages obtained on form A of intelligence tests.
m'' = mental ages obtained on form B of intelligence tests.


A further simplification of this formula occurs when
$R_{s'm'} = R_{s''m''} = R_{s'm''} = R_{s''m'}$. When these correlation
coefficients are approximately equal Holzinger’s formula becomes:

\[
R_{\frac{s'}{m'}\,\frac{s''}{m''}} =
\frac{R_{s's''} + R_{m'm''} - 2R_{s'm'}}{2(1 - R_{s'm'})}
\]
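
The relation between the two forms of the formula can be checked numerically. The following is a minimal Python sketch, not the author's procedure: the function and argument names are hypothetical labels for the symbols defined above, and the input values (.75, .85, .45) are the “Arithmetic: Computation” figures quoted in the text.

```python
import math


def aq_reliability_general(r_ss, r_mm, r_s1m1, r_s2m2, r_s1m2, r_s2m1):
    """Modified Holzinger formula with four separate subject-mental correlations."""
    numerator = r_ss + r_mm - r_s1m2 - r_s2m1
    denominator = 2.0 * math.sqrt((1.0 - r_s1m1) * (1.0 - r_s2m2))
    return numerator / denominator


def aq_reliability_simplified(r_ss, r_mm, r_sm):
    """Simplified form, assuming all four subject-mental correlations equal r_sm."""
    return (r_ss + r_mm - 2.0 * r_sm) / (2.0 * (1.0 - r_sm))


# When the four cross-correlations are set equal, both forms should agree.
general = aq_reliability_general(0.75, 0.85, 0.45, 0.45, 0.45, 0.45)
simple = aq_reliability_simplified(0.75, 0.85, 0.45)
print(round(general, 2), round(simple, 2))
```

With equal cross-correlations the square root in the denominator collapses to (1 - r_sm), which is why the two functions return the same value.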


A comparison of the reliability coefficients obtained by actual
correlation of achievement quotients (Table II) and those obtained
by means of Holzinger’s simplified formula shows that such modification
of the formula applies to these data. Coefficient differences are
not great enough to be significant.


TABLE II.—ACHIEVEMENT QUOTIENT RELIABILITY COEFFICIENTS OBTAINED BY
ACTUAL CORRELATION AND BY MEANS OF A MODIFICATION OF HOLZINGER’S
FORMULA FOR DETERMINING THE RELIABILITY OF RATIOS

                                           By              By
Stanford achievement tests             correlation       formula
                                        (AQ) r_ab       (AQ) r_ab
[row label illegible] ..............       .74             .78
Reading: Word meaning ..............       .55             .53
Arithmetic: Computation ............       .63             .64
Reading: Paragraph meaning .........       .37             .33
Reading: Sentence meaning ..........       .46             .36
Arithmetic: Reasoning ..............       .50             .46

(The coefficients in the first column were obtained by actual correlation of
achievement quotients. Those in the second column were calculated by Holzinger’s
formula.)
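
The “by formula” figures can be retraced from values given elsewhere in the article. The sketch below is an illustration under stated assumptions rather than the author's own computation: it takes each test's reliability and its correlation with mental age from Tables I, IA and IB, uses .85 as the mental test reliability, and applies the simplified formula. The names `aq_reliability_simplified`, `R_MM` and `tests` are hypothetical.

```python
def aq_reliability_simplified(r_ss, r_mm, r_sm):
    """Simplified Holzinger formula for the reliability of the ratio s/m."""
    return (r_ss + r_mm - 2.0 * r_sm) / (2.0 * (1.0 - r_sm))


R_MM = 0.85  # reliability of a single form of the mental test

# (test reliability, correlation of subject age with mental age)
tests = {
    "Reading: Word meaning": (0.83, 0.66),
    "Arithmetic: Computation": (0.75, 0.45),
    "Reading: Paragraph meaning": (0.75, 0.70),
    "Reading: Sentence meaning": (0.73, 0.67),
    "Arithmetic: Reasoning": (0.70, 0.58),
}

by_formula = {
    name: round(aq_reliability_simplified(r_ss, R_MM, r_sm), 2)
    for name, (r_ss, r_sm) in tests.items()
}

for name, value in by_formula.items():
    print(f"{name:<28} {value:.2f}")
```

For these five tests the computed values agree with the second column of Table II (.53, .64, .33, .36 and .46).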


The effect of varying degrees of correlation between subject ages
and mental ages may be noted in this formula by holding the reliabilities
of the tests constant and substituting numerical values for the other
coefficient. Assuming reliability coefficients of .85 for each test, one
obtains the following results when the other coefficient is varied:


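\[
\begin{aligned}
R_{\frac{s'}{m'}\,\frac{s''}{m''}}
  &= \frac{.85 + .85 - 2(.85)}{2(1 - .85)} = .00 \\
  &= \frac{.85 + .85 - 2(.80)}{2(1 - .80)} = .25 \\
  &= \frac{.85 + .85 - 2(.70)}{2(1 - .70)} = .50 \\
  &= \frac{.85 + .85 - 2(.50)}{2(1 - .50)} = .70 \\
  &= \frac{.85 + .85 - 2(.30)}{2(1 - .30)} = .79
\end{aligned}
\]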


356 The Journal of Educational Psychology 


EFFECT OF GRADE PLACEMENT


An attempt was made to show the effect of grade placement upon
the reliability of the achievement quotient. If pupils found in a grade
where the level of work done corresponds to their mental maturity
(MA), or where the rate of increase in difficulty parallels their rate
of mental growth (IQ), are better placed than the class as a whole,
then the achievement quotients of pupils so placed should indicate
the effect of placement.


DETERMINATION OF LEVELS AND RATES OF LEARNING


Since the mean educational age for Grade VIA of Racine was
157 months, this was deemed the best available measure of the level
of difficulty maintained for that grade. Allowing a twelve-month
range, pupils having a mental age between 151 and 163 months were
considered better placed than other members of the grade with respect
to mental maturity. Dividing the mean educational age (157) by the
mean chronological age (151) of pupils of this system should indicate
the degree of brightness to which school progress was adapted. Such
division gives a mean intelligence quotient of 105. If a range of ten
points is allowed, pupils whose intelligence quotients fall between
100 and 110 should be better placed with respect to rate of progress
than others in the same grade.

Three groups were selected from the total number of cases on the 
basis of mental ages and intelligence quotients. One group was 
composed of pupils having mental ages between 151 and 163 months, 
another group of those pupils whose intelligence quotients were 
between 100 and 110, and a third group of those pupils having both 
mental ages and intelligence quotients within the ranges mentioned. 
Correlation of the achievement quotients of each of those groups
shows that, for “Reading” (Table III), placement with respect to
either mental age or intelligence quotient tends to lower the reliability
of the ratio. In “Arithmetic,” placement according to mental age
tends to decrease the reliability of the achievement quotient, while
placement according to intelligence quotients tends to increase it.
The reliability of the achievement quotient for “Spelling” seems to be
increased by improved placement either with respect to mental ages
or intelligence quotients. Improved grade placement, with respect to
mental maturity or rate of mental growth, tends to increase the reliability
of the achievement quotients for subjects possessing a low
correlation with intelligence, and to decrease the reliability of the
achievement quotients for subjects having a high degree of correlation
with brightness.


TABLE III.—SHOWING THE EFFECT OF PLACEMENT ACCORDING TO MENTAL AGES
AND INTELLIGENCE QUOTIENTS UPON THE RELIABILITY OF THE AQ

                                (MA)         (IQ)      (MA and IQ)   (Unselected)
Tests                           r_ab         r_ab         r_ab           r_ab
Reading ..................       .43          .55          .36            .54
  Standard deviation .....   [illegible]     9.27         8.15           8.84
Arithmetic ...............       .52          .68          .60            .63
  Standard deviation .....      8.94         9.36         8.52           9.79
Spelling .................       .70          .80          .73            .74
  Standard deviation .....      8.74        10.03         8.30           9.95

Read table as follows: The reliability of the achievement quotient in Reading
was .43 for pupils having mental ages between 151 and 163 months; .55 for pupils
having intelligence quotients between 100 and 110; and .36 for pupils having both
mental ages and intelligence quotients within the range mentioned. The reliability
of the achievement quotient for the entire group in Reading was .54.


THEORETICAL BASIS OF ACHIEVEMENT QUOTIENTS


The results of this study lead to the conclusion that the use of the
achievement quotient technique involves the assumption that accomplishment
in a particular school subject and intelligence are distinctly
separate traits. Measures of such specific traits should correlate
zero with each other. The introduction of a general factor, either as
an actual human characteristic or as similar material in the tests used,
tends to invalidate the achievement quotient procedure.

The statement that the ideal educational situation is to have the 
educational age approach the mental age is not supported by our 
present knowledge of the nature of the achievement quotient, for as 
these two measures approach each other, the reliability of the ratio 
is bound to decrease. Since, statistically, an unreliable measure 
cannot be valid, it follows that the achievement quotient should be 
discarded, unless, under certain controlled conditions, it is found to 
be better than any other available measure of efficiency. There seems 
to be little sense in the alternative of trying to make the educational 
achievement of a pupil vary as much as possible from his mental ability.


POSSIBLE EFFECT OF PUSHING PUPILS


The results of “pushing” pupils to make their educational ages
agree with their mental ages cannot be measured satisfactorily by 
means of general achievement quotients. When the educational age 
approaches the mental age the achievement quotient becomes a very 
unreliable measure of efficiency. If the “pushing” process succeeds 
in decreasing the differences between EA and MA, one should expect a 
decrease in the reliability of the achievement quotient.


TABLE IV.—ACHIEVEMENT QUOTIENT RELIABILITIES WHICH MAY BE EXPECTED
WITH GIVEN TEST RELIABILITIES AND VARIOUS INTERCORRELATIONS BETWEEN
MENTAL AND EDUCATIONAL TESTS—ASSUMING COMPARABLE MEASURES

Average                        Correlation between tests
reliability   .00   .10   .20   .30   .40   .50   .60   .70   .80   .90
    .98       .98   .98   .98   .97   .97   .96   .95   .93   .90   .80
    .96       .96   .96   .95   .94   .93   .92   .90   .87   .80   .60
    .94       .94   .93   .93   .91   .90   .88   .85   .80   .70
    .92       .92   .91   .90   .89   .87   .84   .80   .73   .60
    .90       .90   .89   .88   .86   .83   .80   .75   .67   .50
    .86       .86   .85   .83   .80   .77   .72   .65   .53
    .84       .84   .82   .80   .77   .73   .68   .60
    .82       .82   .80   .78   .74   .70   .64   .55
    .80       .80   .78   .75   .71   .67   .60   .50
    .78       .78   .76   .73   .69   .63   .56
    .76       .76   .73   .70   .66   .60   .52
    .74       .74   .71   .68   .63   .57
    .72       .72   .69   .65   .60   .53
    .70       .70   .67   .63   .57   .50
    .65       .65   .61   .56   .50
    .60       .60   .56   .50
    .55       .55   .50
    .50       .50


























Use table as follows: Find the heading which represents the correlation between
the mental test scores and the subject test scores (.60), and follow down the
column to the reliability desired for the achievement quotient (.90). The number
at the extreme left of this line (.96) indicates the probable minimum reliability
required of the tests used. (Reliability coefficients below .50 were omitted on
account of not being significant for any purpose.) Table is based on Holzinger’s
formula.


ACHIEVEMENT QUOTIENTS FOR INDIVIDUAL DIAGNOSIS 


Assuming a normal distribution of mental traits and equivalent
measures of mental ability and educational status, Table IV indicates
the degree of reliability required of the tests used if one is reasonably
to expect a desired reliability for the achievement quotient when the
correlation between the different test scores varies. Considering a
reliability of at least .90 necessary for individual diagnosis, it will be
seen that when the tests correlate as high as .50 or .60 with each other
their average reliability coefficients must be .96 or more before the
resulting achievement quotient may be used for individual measurement.
Such a degree of reliability is approached by few group tests.
This leads to the conclusion that, for individual diagnosis, the achievement
quotient should be based upon individual tests or limited to those
subjects having a low correlation with intelligence. It is possible
to obtain achievement quotients from group tests sufficiently reliable
for group measurement.
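
The requirement just stated can also be read off by inverting the equal-reliability form of the formula. The sketch below is a hedged illustration, not part of the original analysis: solving (r - c) / (1 - c) = t for r gives r = c + t(1 - c), and the function name `required_test_reliability` is a hypothetical label.

```python
def required_test_reliability(target_aq_reliability, intercorrelation):
    """Average test reliability needed to reach a target AQ reliability.

    Inverts (r - c) / (1 - c) = t, which assumes both tests are equally reliable.
    """
    c = intercorrelation
    return c + target_aq_reliability * (1.0 - c)


# Target of .90, the level taken above as necessary for individual diagnosis.
needed = {c: round(required_test_reliability(0.90, c), 2)
          for c in (0.00, 0.30, 0.50, 0.60)}
for c, r in needed.items():
    print(f"intercorrelation {c:.2f} -> required reliability {r:.2f}")
```

On this reckoning an intercorrelation of .60 calls for an average reliability of .96, and .50 calls for .95; since Table IV lists reliabilities in steps of .02, .96 is the first tabulated row that meets the .50 case as well.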


CONCLUSIONS 


1. The reliability of the achievement quotient increases with an 
increase in the reliability of the intelligence test. 

2. The reliability of the achievement quotient increases with an 
increase in the reliability of the educational test. 

3. The reliability of the achievement quotient decreases with an 
increase in the intercorrelation of mental test scores and educational 
test scores. 

4. Achievement quotients are less reliable for reading than for 
other subjects due to similarity of test materials. 

5. Improved grade placement, with respect to mental maturity or 
rate of mental growth, tends to increase the reliability of the achieve- 
ment quotients for subjects possessing a low correlation with intelli- 
gence, and to decrease the reliability of achievement quotients for 
subjects having a high degree of correlation with brightness. 

6. The use of the achievement quotient technique involves the 
assumption that accomplishment in a particular school subject and 
intelligence are distinctly separate traits. 

7. Results of “pushing” pupils to make their educational ages
equal their mental ages can not be measured satisfactorily by means of 
general achievement quotients. 

8. Achievement quotients sufficiently reliable for individual diag- 
nosis can not be derived from a single administration of present group 
tests, except for subjects having a low correlation with intelligence.


SOME SUGGESTIONS ON LEARNING FROM THE
POINT OF VIEW OF GESTALT PSYCHOLOGY 


R. RAY SCOTT 
West Virginia Wesleyan College 


The learning process is primarily a process of learning how to 
think. I say primarily, for there are certain desirable simple skills, 
or habits, which can be acquired by means largely devoid of thinking, 
and there may be certain emotional responses which are similarly 
cultivated. There is a growing conviction that too much emphasis
has been placed on the rôle of habit in human life, and on the corresponding
method of drill in education. The reaction is taking the form
of a serious questioning of the Thorndikian psychology which has so
deeply influenced our educational procedure in the direction of habit
formation. The definiteness and plausibility of Thorndike’s so-called
“laws of learning,” and the satisfying simplicity of his famous couplet
“stimulus-response” have led to almost universal approval for a kind
of training which calls for little thinking.

Mr. O. K. Cornwell, of Wittenberg College, has made the interesting
suggestion that Thorndike’s laws can be condensed into one, the
Law of Exercise, which may be stated thus: A modifiable bond which
is exercised will be thereby strengthened, and one which is not used
will thereby be weakened. It seems to me that that is a rather heedless
brushing aside of the Law of Effect¹ with which Thorndike supplements
Exercise. These two laws must be taken together for an adequate
description of learning according to Thorndike. Without the
Law of Effect we would have no explanation of why certain acts (the
successful ones) are selected for repetition rather than other ones.

Assuming, then, that Exercise and Effect are the heart of the 
learning process for Thorndike, what are we to say as to their ade- 
quacy? I have already indicated that the main line of objection 
lies in the mechanistic tendency of this theory. We can grant that 
pain and pleasure look like non-mechanical factors, but they never 
function to create any new behavior; they act simply to revive a 
successful act already performed as a result of trial and error among 
pre-existent bonds. Where the creative and purposive elements 
do not appear, we are bound to see only mechanism. Simply to 
call a theory mechanistic, however, does not disprove it as a true 





1 If a reaction leads to a “satisfactory state of affairs,” the connection involved
in the reaction is strengthened, whereas if it leads to an “unsatisfactory state of
affairs” the connection is weakened.

explanation. Let us see what argument will accomplish. In its
extreme form the Law of Exercise becomes Watson’s Law of Frequency.
It is based upon the proposition that “if one assumes that all possible
acts are equally probable at the start, and that one order of acts is as
probable as any other, it follows that the right act has double the
probability of any act that is wrong.”¹ But can we make this assumption?
Anyone who watches an animal, not to mention a person,
trying to accomplish an act may see it perform an unsuccessful act 
several times in succession. This will result in a wrong act being 
performed more frequently than the right one. Watson was probably 
aware of this objection when he formulated his Law of Recency 
(the last act performed is more likely to reappear first); and certainly 
Thorndike had it in mind when he supplemented Exercise with Effect.
It looks, then, as though the mechanistic theory of learning rests 
its case in Effect and Recency. Concerning the latter we may say 
that so far as anybody knows it may be true that the most recent act 
performed has a prior claim when the individual is placed in the 
same or similar situation, but the pertinent question is: How did this 
successful act come to be performed in the first place? If chance 
is the only factor what is to prevent an individual from running through 
the whole repertory of his acts in every problematic situation? If 
we assume that an individual, whose possible acts in a given situation 
total 100 makes 1000 trials, and that a certain five acts must be per- 
formed in a given sequence to attain success, the probability of a 
successful performance would be expressed by a quotient whose 
numerator would be 995 and whose denominator would be 100 raised 
to the one thousandth power. Would it not be miraculous if correct
solutions were ever achieved where the law of probability reigns? 
The pain and pleasure theory is no more successful in explaining 
why the right act should occur in the first instance; and it has the
additional handicap of having to explain how an end response can 
operate retroactively to influence the prior stages of a series. I 
think we could find no difficulty in accepting all the laws or principles 
discussed up to this point provided they are represented as subsidiary 
and partial explanations of the learning process, and provided they 
are interpreted from some point of view which does not rely upon prob- 
ability or mechanism. 
We must find some explanation of how an organism learns to
circumvent the law of chance and find solutions with a minimum of 





1 Koffka, Kurt: “The Growth of the Mind.”

experimentation. Obviously there must be some guiding principle,
some factor that limits the field of trial and error, some recognition 
of appropriateness between acts performed and the end to be attained. 
We find this in purpose and insight. Every act of true learning is 
characterized by a purposiveness which acts as a principle of organ- 
ization of the component parts. Even the behavior of an animal 
imprisoned in a puzzle-box shows some degree of purposiveness, for 
there is a definite relation between violent movements and escape 
from physical restraint. Even assuming that a large element of 
chance is present in the first escape (which is conceivable where 
the problematic set-up is too complicated for the animal’s intelligence), 
the subsequent escapes will certainly be characterized by some per- 
ception of the relation between the previous successful acts and the 
escape. This perception of pertinency will explain those of Thorndike’s
learning curves which exhibit a sudden drop. Thorndike
himself tries to explain away this type of curve by saying: “Of course
where the act resulting from the impulse is very simple, very obvious,
and very clearly defined, a single experience may make the association
perfect, and we may have an abrupt descent in the time curve without
needing to suppose inference.”¹ It is precisely this “obviousness”
which makes the claim for insight hold water.

Let us now inquire into the bearing of purpose and insight on habit 
formation. With Thorndike and his school, habit is a matter of 
strengthening bonds through exercise or repetition. In the view 
which I am presenting habit formation is a process of facilitating the 
ease of a reaction, or reducing the time of its performance, by fitting 
it into a scheme of understanding. Plausibility for this explanation 
can be found in Koffka’s theory of configuration, which he defines thus:
“Such a coexistence of phenomena in which each member carries every
other, and in which each member possesses its peculiarity only by
virtue of, and in connection with, all the others, we shall henceforth
call a configuration.”² Repetition for strengthening bonds disappears,
and we get in its place practice for the “formation of a figure.” In
the light of this interesting and fruitful conception what can we say 
of the case where children are led through drill exercises in which they 
have no understanding? The configuration in that case includes the 
motive of pleasing the teacher, enjoying the game or social situation, 





1 Thorndike, E. L.: “Animal Intelligence; Experimental Studies,” N. Y., 1911,
pp. 43-44.

2 Koffka, Kurt: “The Growth of the Mind,” pp. 131-132.

etc. A satisfactory conclusion of the behavior from the child’s point
of view can therefore be effected without understanding the signifi- 
cance of the elements drilled. Something may be learned from the 
exercise, but it is not what the teacher thinks he is teaching. Con- 
figurational drill with purpose in the figure will be intelligent drill. 
On its lowest level, such as memorizing nonsense material, it will 
involve such perceptual patterns as rhythm and space relations; on 
its highest level it will involve understanding or insight. Habits 
formed in this way may be used as tools of thinking, the creative ele- 
ment residing in their reorganization for a specific situation. 

With the reader’s indulgence, we shall turn our attention at this 
point to the problem of learning to think. Thinking is primarily 
a matter of using concepts, and education for thinking is nothing
more than education in thinking. We may define a concept as a per- 
ception of relationship which is not immediately present to the 
senses. Where does this perception come from? The answer of the 
older psychology was that it came from the particularization of a 
preformed latent bond. To the gestaltist, a concept is a construct; 
it is something new to the individual created through an act of insight. 
This removes the problem one degree, for we still have to define 
insight. Insight may be thought of as a function of the configuration. 
Life is a moving thing, and it moves in the direction of solutions or 
adaptations. Given a situation which calls for insight, the insight 
comes to complete the figure. Of course it does not always come, but 
it is impossible to conceive of an intelligent creature devoid of some 
measure of insight. The degree of success obtained will depend upon 
many factors, such as physical heredity, mental heredity, health, 
emotional state, etc., the relative influence of which is not known. 
Concepts are not static, notwithstanding they are more or less crys- 
tallized into language forms. Through insight, generally facilitated 
by application, the understanding of the concept is enlarged. The 
significance of the growth of concepts for adaptative behavior may be 
seen when it is reflected that at a given point when a certain datum is 
present as a stimulus, the individual’s understanding of the datum is a
real part of the stimulus.¹ In this way is a functional unity of individual
and environment effected.





1 In speaking of Kohler’s experiment No. 3, Koffka, on page 191 of his “Growth
of the Mind,” says: “As a necessary condition for a correct type of behavior an
alteration must occur in the object of perception. What at the beginning possessed
only the character of ‘indifference’ or ‘something to bite upon,’ etc., now obtains
the character of a ‘thing to fetch fruit with.’”

I have already mentioned that the way to learn to think is to
practice it. Now it will be apparent that no general capacity to 
think can be cultivated unless meanings acquired by insight in one 
connection can be seen to apply in other connections. That is, the 
transfer of meanings is of the essence of learning to think. This 
may be illustrated by the two following examples: A girl of six was 
taken on the toboggan at an amusement park. The boat in which 
she sat was pulled up an inclined plane, and when at the top its direc- 
tion was reversed by means of a turntable. The manner of turning 
the boat about must have attracted notice, for the whole configura- 
tional set-up must have been conducive to the seeing of the problem: 
How is this long boat going to get turned around so as to start us down 
this runway which is immediately adjacent to the one we are now 
travelling? However that may be, thirteen days later the child, while 
riding in an automobile, abruptly posed the question: “How are 
trains turned around?” Her aunt replied, “By means of a turntable.” 

“What’s a turntable?” was the child’s next question. 

“Well, it’s a sort of platform which is turned around while the 
engine is on it.” 

“Oh!” said the girl, “I know, like the boat.” 

The transfer in this typical situation is quite evident. Is it not 
equally evident that the concept of “turntable” received some 
enrichment from the second experience? 

The second illustration is much more complicated. While in 
process of typing this paper the writer’s machine developed a mechani- 
cal defect which made it impossible to print the capitals. A super- 
ficial observation revealed that the ribbon-carrier would not lift to 
meet the type when the shift key was used. The method of solving 
the difficulty was characterized by motor experimentation controlled 
or guided by analysis and synthesis, which processes were purposive. 
Starting with the obvious causal relation between pressing a key and 
the lifting of the ribbon-carrier in the properly functioning typewriter, 
I inferred that something had become unhooked or otherwise malad- 
justed somewhere between the key and the ribbon-carrier. Therefore 
I began a process of tracing the complicated mechanical relationships 
until I finally arrived at the source of trouble. The phrase “tracing 
the complicated mechanical relationships” covers about two hours of 
concentrated mental labor in which there was a large amount of 
controlled experimentation. It is not my purpose systematically to 
analyze the thinking process involved in this illustration, but rather 
to call attention to the rôle of concepts or meanings transferred from 
other contexts. Some of the concepts employed were: 

1. The idea “I must proceed in a systematic, intelligent way if I am to win 
success.” 

2. Mechanical causality. 

3. The various lever concepts. 

4. The idea of the direction of force when inclined planes are brought together 
in various ways. 

5. The ideas of friction and that it will produce wearing. 

6. The idea of the action of a spring. 

7. The idea of the functional unity of a machine. 

10. Etc. 

It requires no unusual perspicacity to see that a person addressing 
himself to the task of repairing a typewriter would be unable to 
make much progress if he came to his task without a knowledge of 
the concepts enumerated, or if he possessed these concepts only in 
a particularized context, like the little girl who when asked by her 
grandfather, “How many fingers have I?” replied, “I don’t know; 
I can only count my own fingers” (“Growth of the Mind,” page 334). 
The likelihood of performing by chance such a complicated task as the 
one indicated need not be given serious consideration. 

In conclusion I return to the proposition with which I began, 
that learning is primarily a process of learning how to think. We 
have seen how the purposive character of behavior will redeem habit 
formation from the mode of formal mechanical repetition. We have 
seen how purposive behavior is best explained as the function of mental 
patterns or configurations in which the organism reacts to a total 
situation with a certain element standing out as an organization 
principle. Finally, we have seen that growth in the capacity to think 
is brought about by the acquiring, through insight, of concepts, and 
their elaboration through transfer. 


A STUDY OF THE MENTAL GROWTH OF DULL 
CHILDREN 


L. R. WHEELER 
East Tennessee State Teachers College 


The problem of mental growth is one of the most difficult with 
which the educator has to deal. Probably this is one of the reasons 
why there has been such a limited amount of writing on the subject. 
The scientific study of mental development has been greatly handicapped 
by the lack of accurate measuring devices, which are essential to 
such a study. The writer realizes that the technique of measuring and the 
scales for measurement used in this and similar studies are subject to 
variation, but maintains that the results will show a 
general trend for the group which will be valuable and significant to 
the educator. 

There are many problems of mental growth which we shall not 
be able to solve. For example, we do not know how much a child 
grows in a month or in an interval of less than a year, but investigators 
have assumed that individuals follow the concomitant theory of growth 
between the annual measurements. There is also a great need for 
more investigations of the annual mental growth of preschool children 
which would aid in making certain predictions of the results of growth 
after the child enters school. We are not sure of just how the child 
develops in mental growth during the adolescent period, but there are 
studies which give us some light upon this problem. Another problem 
of vital interest in the study of mental growth is still in the experi- 
mental period: as to the end point of growth there is a difference of 
opinion. Where does the mental development of the normal child 
end? These problems can be solved only by improving the scales and 
technique of measuring, and in measuring the mental growth of the 
same children at regular intervals over a period of years. 

The origin of the study of mental growth dates back to the work of 
Binet in his attempt to devise some method of measuring intelligence 
with a definite objective measure. Since he gave psychologists 
the concept of mental age, much progress has been made in studying 
the mental status of children, and this has resulted in the measurement 
of mental growth. Bobertag¹ was one of the early pioneers in this work. 





1 Bobertag, O.: Über Intelligenzprüfungen. Zeitschrift für angewandte Psychologie, 1912, 
pp. 495-538. 

The eighty-three children tested were retested the following year and 
the results studied. This study showed that children whose IQ’s 
were above the average made better progress in school subjects than 
those of IQ’s below the average. Bobertag maintained that if 
the examinations are limited to a period of a few years such as eight, 
nine, and ten, one could say that the IQ was approximately constant, 
but that it is questionable whether the IQ will remain constant for all 
ages. 

Berry¹ found that normal children gained about twelve mental 
months in a year, while feebleminded children with a mental age of 
about five years gained about six months in a chronological year. 
This investigation shows that there may be a wide variation in mental 
growth when there are different intelligence levels. Kuhlman² 
studied the results of retests on feebleminded children and found 
that the IQ had a tendency to decrease after the years nine or ten, 
which would indicate that the child’s mental growth became less 
with an increase of chronological age. Terman³ studied the retest of 
one hundred and forty children on the Stanford Revision of the Binet- 
Simon Test and found that superior children on the first test will 
be superior on the second test, and that the feebleminded will maintain 
their former position in the group. Freeman⁴ discussed the general 
trend of the mental growth curve as to whether it takes a sudden or 
gradual rise. He shows by a number of graphs the general direction 
of the mental growth curves of normal and subnormal children, and 
sums up his conclusions as follows: ‘‘There seems to be evidence of 
considerable weight that the typical growth curve follows a uniform 
rate within at least the period covered roughly by the elementary 
school.” 

Baldwin and Stecher⁵ made an intensive study of normal and 
superior children, tested by the Binet-Simon test. The range in IQ of 





1 Berry, C. S.: Eighty-two Children Retested by the Binet Test of Intelligence. 
Psychological Bulletin, 1913, pp. 77-78. 

2 Kuhlman, F.: What Constitutes Feeblemindedness. Journal of Psycho-Asthenics, 
1915, pp. 214-246. 

3 Terman, L. M.: “The Stanford Revision and Extension of the Binet Scale for 
the Measurement of Intelligence.” Warwick and York, 1917, pp. 51-61. 

4 Freeman, F. N.: Interpretation and Application of the Intelligence Quotient. 
Journal of Educational Psychology, 1921, pp. 3-13. 

5 Baldwin, B. T. and Stecher, L. I.: “Mental Growth Curve of Normal and 
Superior Children.” Studies in Child Welfare, Vol. 2, No. 1, 1922. Published by 
University of Iowa. 

these groups was from 90 to 167, and these were divided as normal 
with IQ’s from 90 to 110, and superior with IQ’s above 110. The 
normal group grew approximately twelve mental months a year, while 
the superior children showed a higher rate of mental growth. This 
is one of the most extensive studies the writer has been able to find 
indicating that mental growth curves take a similar direction to 
physical growth curves. The investigators summed up their study 
in the following words: ‘‘The most significant outcome of this study 
is the empirical determination of the mental growth curve and 
the establishment of the close interrelation between physical and 
mental development as shown by the general similarity between 
growth in height and mental age, the rise of the mental age curves at 
the adolescent years, the superior mental development of physiologi- 
cally accelerated children, and the high correlation between mental age 
and height.”’ This study shows some careful research on the problem 
of mental growth, but the writer, from recent studies, has not been able 
to find such a marked or high correlation between mental and physical 
growth. However, all studies show the general trend to be in the 
same direction. This study is in fair agreement with other investiga- 
tions showing that the superior child grows at a more rapid rate as 
measured by our scales and the subnormal child grows slower mentally 
than the normal and superior child. 

Doll¹ studied the mental growth curves of feebleminded children 
on a series of five tests. He found that feebleminded children seemed 
to reach a period of arrest in their mental growth, after which the IQ 
decreases. Terman² criticized this study and maintained that mental 
growth is a gradual, continuous development from birth. Rugg and 
Colloton³ studied the results in retesting children in the Lincoln 
School. They tested each child twice with a year’s interval between, 
using the Stanford Revision of the Binet-Simon test, and summed 
up their results as follows: 


1. That the chances are twenty to one that the first IQ is within 
thirteen points of the true IQ. 


2. That the typical difference of the middle fifty per cent is less 
than six points of the true IQ. 





1 Doll, E. A.: The Growth of Intelligence. Psychological Monographs, 1921, p. 130. 

2 Terman, L. M.: Mental Growth and the IQ. Journal of Educational Psy- 
chology, 1921, pp. 325-341. 

3 Rugg, H. and Colloton, C.: Constancy of the IQ. Journal of Educational 
Psychology, 1921, pp. 315-322. 

3. That the giving of a single retest increases the reliability forty 
per cent. 


4. No group of school children shows an average variation of more 
than seven points in IQ. 

The findings in this investigation indicate that the IQ is a fairly 
constant ratio, and the intelligence test is a fairly reliable means of 
determining mental age. 

Wentworth¹ studied the reliability of group tests as compared with 
the Stanford Revision of the Binet-Simon test. Her results were as 
follows: 

1. Dearborn General Intelligence A 1922 and Binet-Simon, 575 cases, 
r = .72 ± .014. 

2. Dearborn Intelligence A 1923 and Binet-Simon, 575 cases, 
r = .68 ± .06. 

3. Dearborn General Intelligence A 1922 with General Intelligence 
A 1923, 575 cases, r = .72 ± .013. 

4. Stanford-Binet and Stanford-Binet retests, 145 cases, 
r = .83 ± .03. 

The above data indicate that there is a high correlation between 
the results obtained from the Dearborn Group Tests and the results 
of the Binet-Simon Individual Tests. Wentworth’s study? shows 
that results obtained from the Dearborn Group Tests are reliable 
in comparison with data obtained from the Stanford Revision of the 
Binet-Simon test. Other investigations have shown similar findings, 
and the writer feels that he is justified in using the Dearborn Group 
Tests in selecting and studying the mental growth of dull children. 

The data for this investigation were obtained from the Harvard 
Growth Study which was begun in 1922-1923 for the purpose of making 
repeated mental measurements on the same children for a series of 
years. A full and comprehensive description is given by Latshaw.³ 
“The Growth Study seeks, through a consideration of data obtained by 
repeated annual measurements on the same individuals to appraise 
the mental growth of each child. In most of the previous studies 
the individual has been lost in the group, or if individual methods 





1 Wentworth, Mary M.: “Individual Differences in the Intelligence of School 
Children,” Harvard University Press, 1926, pp. 20-48. 

2 Wentworth, Mary M.: “Individual Differences in the Intelligence of School 
Children,” Harvard University Press, 1926, pp. 20-48. 

3 Latshaw, H. F.: “Measurement of Physical Growth.” Thesis, Graduate 
School of Education, Harvard University, 1925. 

have been used, the groups have been small, and due to the small 
number of cases, the results have not been as meaningful.” 

The data used in this study have been collected and compiled by 
individuals who have been technically trained to carry on research 
in mental measurements of school children. The intervals between 
the different measurements are approximately twelve months. This 
is of vital importance because mental growth should represent as 
nearly as possible the same stage of growth of each individual for 
each year. The writer, in selecting data for this study of the mental 
growth of subnormal children, based his selection on the results 
of the Dearborn Group Intelligence Tests, Series I and II. All children 
were chosen who had an average IQ for the four years below 90. We 
feel that the children in this group are fairly representative of what one 
might find in the public schools over the country. All races were 
omitted but the North Europeans. The IQ’s of the boys range from 
60 to 90, and of the girls from 51 to 90. Only five per cent of the group 
fall below 69 in IQ. 

The median for the entire group studied is 82.9 which indicates 
that the cases fall mostly within the limits of a dull group. We 
have included children in this study who are below the dull line of 
demarcation but feel that the per cent is not great enough to make 
a marked difference in the results of mental growth. The statistical 
terms indicate that the group is principally dull or what one might 
expect to find in an unselected group of the average school population. 

In making a study of this type on the same children for a period 
of years the number of cases becomes smaller each year due to absences, 
sickness or some other reason which cannot be remedied and we are 
confronted with the problem of a smaller number of cases with an 
increase in chronological age. We are not attempting to study the 
individual child within the group as to the amount gained or lost in 
IQ, or the influence of learning on the test, or the question of the 
validity of the test, but we are attempting to study the gain on 
the basis of mental age as set down in the test. Since this is one of the 
early studies of mental growth on the same children over a period of 
years based upon the results of group tests, we are limited in explaining 
the reason for many of the discrepancies which will arise in the reader’s 
mind as he examines the study critically. 

The statistical terms used in measuring the increments of growth 
are the median, first and third quartiles, and the semi-interquartile 
range, as shown in Tables I, II, and III. In obtaining the mental 
growth we have used the increment between the first and second test, 
second and third, third and fourth, etc. The results are shown in 
Tables I, II, and III as well as in Fig. 1. 

In studying this group of children we have found no marked sex 
difference as shown in Table III. Because of this fact we have divided 
the data into two groups, disregarding sex. The first group consists 
of children of chronological ages between six and seven. The mental 
growth of these children was studied for four consecutive years. This 
includes the mental growth on the same children from age six through 
nine as shown in Table I. 

Fig. 1. Mental growth of dull children. 

Table I shows that the median growth of the six year group for 
the first year was 12.1 months, for the second year 6.9, third year 9.6, 
and a median average of 9.5 mental months each year. It is also 
evident that the general increment of growth as shown by the first and 
third quartiles has the same general trend. We are not able to explain 
objectively why there is such a wide difference between the first 
increment of growth and the others. 

The second group of children were seven years old chronologically, 
and have been studied for a period of four years, or from seven to 
eleven, as shown in Table II. 

From ages seven to eight this group gained 9.6 mental months, 
eight to nine 5.8, from nine to ten 4.9, and an average of 6.8 mental 
months each year. This group seems to grow at a decreasing rate 
with an increase of chronological age. The average mental growth 
of the seven year group is also less than the six year group. Since 
this group was tested and scored by the same persons on the same test 
the same year as the six year group, the large difference in mental 
growth of the first year’s increments of each group cannot be accounted 
for on the basis of the technique of measurement. It seems that the 
first and third quartiles indicate the same general trend of the group 
in terms of mental growth as the medians. 

Table III shows the results of the seven and six year old groups 
combined according to age and sex. The group as a whole shows a 
decreasing increment of mental growth as the children grow chronologi- 
cally. The first increment is 11.2, second 8.6, third 6.6, and the 
fourth is 4, with an average of 7.6 mental months for each year. The 
first and third quartiles indicate a similar trend. Figure 1 shows the 
general trend of the group in mental growth which seems to be fairly 
regular from age to age, with a decreasing increment especially between 
nine and ten. This decreasing increment is accompanied by a wider 
range as shown by the semi-interquartile range, and indicates that the 
dull child develops slower than the normal and superior child. This 
fact suggests that the IQ of the dull child becomes smaller on these 
tests with an increase of chronological age. We regret that our data 
do not include growth of these children for older ages, but probably 
in a later study the measures on these same children will be discussed 
further. 
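
The yearly averages quoted for the three tables can be checked against the increments given in the text. The following is a minimal sketch, assuming each reported average is the plain arithmetic mean of the median yearly increments (the text calls it a "median average" and does not state the method); the dictionary keys and variable names are illustrative only.

```python
# Median yearly increments (mental months) as quoted in the text.
increments = {
    "six-year group (Table I)": [12.1, 6.9, 9.6],
    "seven-year group (Table II)": [9.6, 5.8, 4.9],
    "combined group (Table III)": [11.2, 8.6, 6.6, 4.0],
}

# Assumption: the reported average is the arithmetic mean of the increments.
mean_increment = {
    group: round(sum(values) / len(values), 1)
    for group, values in increments.items()
}

for group, mean in mean_increment.items():
    print(f"{group}: {mean} mental months per year")
```

Under this assumption the computed means agree with the 9.5, 6.8, and 7.6 mental months reported above, all well short of the twelve mental months a year taken as normal growth.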

The data show that the average dull child is about one year men- 
tally retarded when he enters school and this retardation increases 
from year to year until at the age of ten to eleven he is retarded 
mentally over two years as shown in Table III. According to our 
classification in school on the basis of chronological age the dull child 
should be in the fifth grade, but on the basis of mental development 
he can master only the third grade. This is an important factor to 
be considered by the school. 

Figure 1 shows a wide difference between the chronological and 
mental age of the dull children and this difference increases as the child 
grows older in chronological age. There seems to be a divergence in 
the growth curves as shown by the medians of the different ages. 
Table III shows that the dull children enter school with a chronological 
age of six and are retarded mentally about a year. The longer 
they stay in the school the greater the mental retardation becomes. 
The dull boys are retarded the first year 12.3 mental months and this 
increases each year in school. At the age of ten to eleven they are 
retarded 27.5 mental months. At the age of six the girls are retarded 
10.9 and at the age of ten this difference has increased to 30.5 mental 
months. The total group of dull children are retarded 11.7 at age six 
and the retardation increases each year until at the age of ten they are 
29.3 mental months retarded which explains why the child progresses 
so slowly through school. It seems that as the child grows older the 
problem of making the grade corresponding to his chronological age will 
become more difficult. This suggests to the writer the importance 
of a differentiated curriculum which will allow the dull child to make 
progress through the school at a slower rate and on a different level 
from the normal and superior child. These data indicate that the 
dull child will have even greater difficulty above age eleven in making 
his grade and normal progress. The growth as measured by the Dear- 
born tests indicates a smaller increment for the higher ages. Of course 
the school will face even a greater problem during the adolescent period 
and as long as the dull children are under the jurisdiction of the public 
school. We are fairly safe in saying that the schools in Massachusetts 
will compel these children to remain in the educational process at least 
four or five years more which will mean that the mental retardation 
will be quite marked, and we would naturally expect that a large 
number will fall in the class of misfits. The writer believes that this 
group is typical of what one will find in the average American school 
system. The majority of the schools are not making the proper 
provisions to meet this wide divergence in mental growth and retardation. 
The answer to many of the problem children in the school 
is that the school is built too much around the performance of the 
child who grows at a normal rate or about twelve mental months a 
year, and not for the children who deviate from this standard. 


SUMMARY AND CONCLUSIONS 


1. A survey of the studies of retests on the same child indicates 
that the IQ is fairly constant. 

2. The number of studies of mental growth of the same children 
over a period of years seems greatly limited, and especially those 
made by the use of group intelligence tests. 

3. The scales used in the past and at present for the measurement 
of mental growth are subject to many criticisms and variations 
which make the problem still more complicated. 

4. The data for this investigation have been collected and scored by 
individuals especially trained in the technique of mental measurements. 

5. These data consist of public school children selected entirely 
by the use of group intelligence tests. 

6. This investigation shows no marked sex difference in mental 
growth from age six to eleven. 

7. The gain in mental growth of the six year group of children 
shows the same general trend as the seven year group for four con- 
secutive years. 

8. The groups combined show a decreasing increment of mental 
growth from age six to eleven. 

9. This investigation shows that the average dull child is about 
one year mentally retarded when he enters school, and this retardation 
increases from year to year until at the age of ten to eleven he has a 
mental retardation of over two years. 

10. This study of mental growth of dull children emphasizes the 
importance of special classes and a differentiated curriculum for dull 
children. 





A NOTE ON THE ARITHMETICAL ACCURACY OF 
PARTIALS INVOLVED IN MULTIPLE R 


FRANK K. SHUTTLEWORTH 
Yale University 


It is commonly stated that calculation of Multiple R from the 
alternative formulae checks all the partial correlations involved. In 
working out a six variable problem the alternative formulae 


$$1 - R^2_{1.23456} = (1 - r^2_{12})(1 - r^2_{13.2})(1 - r^2_{14.23})(1 - r^2_{15.234})(1 - r^2_{16.2345})$$

and

$$1 - R^2_{1.23456} = (1 - r^2_{16})(1 - r^2_{15.6})(1 - r^2_{14.56})(1 - r^2_{13.456})(1 - r^2_{12.3456})$$


yielded .6821 and .6815, or a difference of .0006. Is this an adequate 
check? The following procedure was used to obtain a rough answer 
to this question. The calculation of the Multiple R .6815 involved the 
following figures: 


$$1 - .6815^2 = (1 - .567^2)(1 - .3767^2)(1 - .2565^2)(1 - .1016^2)(1 - .0717^2)$$


To discover the error in the fourth order partial which would yield 
a difference of .0006 in the Multiple R, the following substitution was 
made: 


$$1 - .6821^2 = (1 - .567^2)(1 - .3767^2)(1 - .2565^2)(1 - .1016^2)(1 - x^2)$$


The fourth order partial by this substitution proved to be .0814. 
That is, a difference of .0006 between the Multiple R’s would allow for 
as much error as the difference between .0717 and .0814, or .0097. 
Similarly a difference of .0006 between the Multiple R’s would allow 
for as much error in the first, second, and third order r’s as .0017, 
.0027, and .0070. For the problem we were working on, such errors 
were not serious. Accordingly, it was concluded that the Multiple 
R’s .6821 and .6815 adequately checked the problem. 
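
The substitution described above can be reproduced numerically. The sketch below is only an illustration of the arithmetic, not the writer's own worksheet: it takes the five partials as printed (reading the first order partial as .3767, which is an assumption where the print is unclear), solves for the value each partial would need in order to give a Multiple R of .6821, and reports the difference. The function and variable names are hypothetical.

```python
import math

# Partials as printed, from zero order (index 0) to fourth order (index 4).
# The first order value .3767 is a reading of an unclear figure.
partials = [0.567, 0.3767, 0.2565, 0.1016, 0.0717]
target_R = 0.6821  # Multiple R given by the alternative formula


def multiple_r(rs):
    """Multiple R from 1 - R^2 = product of (1 - r^2) over the partials."""
    product = math.prod(1 - r * r for r in rs)
    return math.sqrt(1 - product)


def partial_needed(rs, index, R):
    """Value the partial at `index` must take for the product to yield R."""
    others = math.prod(1 - r * r for i, r in enumerate(rs) if i != index)
    return math.sqrt(1 - (1 - R * R) / others)


base_R = multiple_r(partials)

# Error in each higher order partial that a .0006 difference would conceal.
allowed_error = {
    order: partial_needed(partials, order, target_R) - partials[order]
    for order in range(1, 5)
}

print(f"Multiple R from the printed partials: {base_R:.4f}")
for order, error in allowed_error.items():
    print(f"order {order}: allowable error {error:.4f}")
```

Run as written, this agrees with the figures in the text: the printed partials give a Multiple R of about .6815, and the allowable errors come out near .0017, .0027, .0070, and .0097 for the first through fourth order partials.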

Some doubts remained, however, and the problem was reworked 
by the writer and Miss Ivy C. Husband. The following procedure 
was used: Starting with the zero orders correct to three places, the 
factors $\sqrt{1 - r^2}$ to six places were taken from Miner’s Tables; 
numerators and denominators of each partial were carried to seven places and 
the sixth corrected from the seventh; the partials were calculated to 
five places and the fourth corrected from the fifth; in calculating Multi- 
ple R nine places were carried. A perfect check throughout was 
obtained between the work of Miss Husband and the writer. The 
alternative formulae yielded .68149 and .68151, or a difference of 
.00002. 

Comparison of the revised and presumably correct partials with the 
partials as originally calculated revealed an astonishing number of 
both small and large discrepancies. Of the twelve second order 
partials, all but one showed discrepancies. The two largest 
discrepancies were .0124 and .0155, the average being .0035. All of the 
nine third order partials showed differences. The largest differences 
were .0196 and .0230, the average being .0059. All of the five fourth 
order partials showed differences. The largest was .0447, the average 
being .0110. Nearly two-thirds of the second, third and fourth order 
partials as originally determined were in error by more than as much 
as seemed reasonable to expect from the discrepancy between .6821 
and .6815. 

The data definitely suggest the conclusion that serious errors may 
occur in the calculation of partial correlations even though calculation 
of Multiple R from the same partials by the alternative formulae 
provides a nearly perfect check. 


NOTE REGARDING THE SUBJECT-MATTER 
PROGRESS OF THREE ACTIVITY SCHOOLS 
IN HAWAII—A CORRECTION 


JACK W. DUNLAP and EDWARD E. CURETON 
Territorial Normal and Training School, Honolulu, T. H. 


In the October, 1929, number of this Journal, there appeared 
an article by Helen G. Pratt and the writers giving the results of a 
study of the progress, as measured by the Stanford Achievement 
Test, of three activity schools. In interpreting these results, use was 
made of a standard error formula derived by the writers. This 
formula contained a serious algebraic error, which was pointed out by 
Holzinger.¹ The error was due to carelessness on the part of the two 
present writers alone. 

The results given were therefore in error, and the basic table is 
presented again with the probable errors computed by the correct 
formula; the previous incorrect values being retained for purposes of 
comparison. 


N    | Median CA | Gain or loss | Correct PE | Old PE

KAWANANAKOA
60   | 9-7       | Gain .096    | .028       | .029
56   | 10-7      | Loss .111    | .033       | .053
50   | 11-6      | Loss .040    | .038       | .079
49   | 12-6      | Gain .012    | .040       | .087
46   | 13-5      | Gain .20     | .052       | .18
36   | 14-4      | Gain .15     | .094       | .28

WAILUKU
44   | 9-1       | Gain .113    | .032       | .032
62   | 9-11      | Gain .030    | .028       | .037
85   | 10-10     | Loss .100    | .030       | .062
85   | 11-11     | Gain .076    | .032       | .073
55   | 12-9      | Gain .135    | .038       | .081
41   | 13-9      | Gain .188    | .046       | .099
30   | 14-9      | Gain .07     | .022       | .19

KAMEHAMEHA II
41   | 9-1       | Loss .020    | .033       | .035
81   | 9-10      | Loss .008    | .024       | .027
102  | 10-10     | Loss .001    | .026       | .053
87   | 11-10     | Gain .043    | .028       | .055
60   | 12-9      | Gain .06     | .024       | .10
46   | 13-8      | Gain .09     | .028       | .12
24   | 14-7      | Gain .13     | .047       | .22


1 The Probable Error of a Difference Formula. Journal of Educational Psy- 
chology, Vol. XXI, No. 1, Jan., 1930, pp. 63-64. 

The conclusion given in the previous paper was, “These figures 
show that on the things which the Stanford Achievement Test measures, 
there was no significant loss, and in the case of the youngest groups 
in two of the schools, there was significant gain. All three schools 
maintained about the same rate of subject-matter progress under the 
new program as under the old.” If we consider as before that a 
gain or loss whose value is greater than three times its probable 
error is reasonably significant, it may be seen that there are in fact 
two cases of significant loss and seven of significant gain. If we adopt 
a slightly more rigorous standard, and demand that a gain or loss 
exceed three and one-half times its probable error to be considered 
significant, there are no cases of significant loss and four of significant 
gain. The general conclusion that subject-matter gains are at least 
as great under the activity program as under the formal program still 
holds. 
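
The two counts just given can be checked by applying each criterion to the corrected probable errors. The following is a rough sketch only: the figures are transcribed from the corrected table (the last two Kawananakoa rows are readings of damaged print and should be treated as assumptions), and the function name is hypothetical.

```python
# (school, direction, amount, corrected PE), one tuple per age group.
rows = [
    ("Kawananakoa", "gain", 0.096, 0.028),
    ("Kawananakoa", "loss", 0.111, 0.033),
    ("Kawananakoa", "loss", 0.040, 0.038),
    ("Kawananakoa", "gain", 0.012, 0.040),
    ("Kawananakoa", "gain", 0.20, 0.052),   # reading of a damaged row
    ("Kawananakoa", "gain", 0.15, 0.094),   # reading of a damaged row
    ("Wailuku", "gain", 0.113, 0.032),
    ("Wailuku", "gain", 0.030, 0.028),
    ("Wailuku", "loss", 0.100, 0.030),
    ("Wailuku", "gain", 0.076, 0.032),
    ("Wailuku", "gain", 0.135, 0.038),
    ("Wailuku", "gain", 0.188, 0.046),
    ("Wailuku", "gain", 0.07, 0.022),
    ("Kamehameha", "loss", 0.020, 0.033),
    ("Kamehameha", "loss", 0.008, 0.024),
    ("Kamehameha", "loss", 0.001, 0.026),
    ("Kamehameha", "gain", 0.043, 0.028),
    ("Kamehameha", "gain", 0.06, 0.024),
    ("Kamehameha", "gain", 0.09, 0.028),
    ("Kamehameha", "gain", 0.13, 0.047),
]


def count_significant(table, multiple):
    """Count gains and losses that exceed `multiple` times their PE."""
    counts = {"gain": 0, "loss": 0}
    for _school, direction, amount, pe in table:
        if amount > multiple * pe:
            counts[direction] += 1
    return counts


print(count_significant(rows, 3.0))
print(count_significant(rows, 3.5))
```

With these readings the three-times criterion gives seven significant gains and two significant losses, and the three-and-one-half-times criterion gives four gains and no losses, matching the counts stated in the text.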


THE EFFECT OF WEIGHTING EXERCISES IN A NEW 
TYPE OF EXAMINATION¹ 


STEPHEN MAXWELL COREY 
University of Illinois 


The following is a report of the effect upon grades of assigning 
different weights to items in a new type examination. Work has been 
done by Holzinger, Douglas, Spencer, and others² with respect to the
correlation between raw and weighted scores, but the more practical 
question of the effect of weighting upon grades has received slight 
attention. 

The new type examination used concerned itself with information 
relative to facts offered in a course in general psychology. There 
were seventy-three items in the test representing the following different 
varieties of new type exercises: (1) Matching; (2) Multiple choice; 
(3) Incorrect statement; (4) Completion. 

All of the items in one of the examinations were correctly answered 
and each converted into statement form. For example, the items in 
the matching exercises were correctly associated, the proper words were 
inserted in the blanks in the completion exercises, and the correct 
alternatives indicated in the multiple choice; each was then expressed 
as a statement. The result of this procedure was a list of seventy-three
statements of fact concerning general psychology. Six copies were 
made and one given to each member of the educational psychology 
staff, who was asked to evaluate every sentence with respect to its 
importance for a general knowledge of elementary psychology. 

After the seventy-three items had been assigned weights, it was 
possible for each examination to have seven different total scores, one 
of them the raw score, and the others those that would have been 
assigned by the different instructors. One hundred examinations 
were selected at random from the five hundred given, and the seven 
scores for each determined. The following two statistical treatments 
were undertaken: 


1. The computation of coefficients of correlation between the raw scores and 
the weighted scores. 


2. An analysis of the effect of assigned weights upon theoretically determined 
grades. 





¹ The writer wishes to thank Professor E. H. Cameron of the University of
Illinois who suggested this study.
² See Journal of Educational Psychology, Vol. XIV, pp. 109, 279.


The coefficients of correlation were derived by using the Pearson
product-moment formula and appear in Table I. These coefficients
of correlation are interpreted to mean that weights assigned by all
but E noticeably affected the relative total scores of each paper.
The raw score determinations were not in close agreement with the
scores of the examinations after the items had been weighted.

TABLE I. COEFFICIENTS OF CORRELATION BETWEEN RAW AND WEIGHTED SCORES

WEIGHTS ASSIGNED BY    CORRELATION WITH RAW SCORE    PE
A                      .88                           ± .015
B                      .88                           ± .015
C                      .836                          ± .02
D                      .824                          ± .02
E                      .96                           ± .01
F                      .836                          ± .02
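
The comparison of raw and weighted scores can be sketched as follows. The four papers, the five items, and the weights are hypothetical (the study used seventy-three items and six sets of weights); the functions only illustrate how a weighted total and a Pearson product-moment coefficient would be computed.

```python
from math import sqrt


def total_score(responses, weights=None):
    """Sum the credits for one paper; responses are 1 (right) or 0 (wrong).

    With no weights every item counts one point, which gives the raw score.
    """
    if weights is None:
        weights = [1] * len(responses)
    return sum(w * r for w, r in zip(weights, responses))


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: four papers, five items, one instructor's weights.
papers = [[1, 1, 0, 1, 0], [1, 0, 1, 1, 1], [0, 1, 1, 0, 0], [1, 1, 1, 1, 0]]
weights = [3, 1, 2, 1, 3]

raw = [total_score(p) for p in papers]
weighted = [total_score(p, weights) for p in papers]
print(raw, weighted, round(pearson_r(raw, weighted), 3))
```

In this toy case two papers tie on raw score but separate once the items are weighted, which is the kind of shift in relative standing that a coefficient below 1.00 reflects.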

As was stated above, a more practical question than that of the
correlation between raw and weighted scores is: What effect has weighting
upon students’ marks? This was determined by ranking the
hundred papers according to each of the seven methods of scoring, and
giving the first seven A, the next twenty-four B, the next thirty-eight
C, the next twenty-four D, and the last seven E. Table II summarizes
the results.


TABLE II. PER CENT OF PAPERS GIVEN SAME GRADE USING DIFFERENT
METHODS OF SCORING

PER CENT GIVEN SAME GRADE    NUMBER OF DIFFERENT METHODS OF SCORING
25                           7
22                           6
26                           5
27                           4

It shows the per cent of examinations given the same grade under the
different methods of scoring; i.e., twenty-five per cent were given the
same grade in all seven instances, twenty-two were given the same
using six of the possible methods, etc. It is worthy of particular note
that eight of the papers were assigned three different grades. Table
III represents the per cent of cases wherein there was disagreement
between the grade given by the instructor designated and the grade
determined from the raw score ranking. For example, had the grades
been assigned upon a raw score basis, D would have disagreed in forty-nine
per cent of the papers by as much as one step in the series of letter
grades; C on thirty-eight, etc. It is obvious that these per cents are
quite high, which clearly indicates that competent judges did not agree
with the raw score evaluations of these examinations.

TABLE III. PERCENTAGE OF EXAMINATIONS ON WHICH GRADES ASSIGNED BY
INSTRUCTORS DIFFERED FROM RAW SCORE GRADES

INSTRUCTOR                      PER CENT
(labels illegible in source)    30
                                22
                                38
                                29
                                30
                                49
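
The grading procedure described above (rank one hundred papers, then give 7 A, 24 B, 38 C, 24 D, and 7 E) can be sketched in Python. The scores below are hypothetical, and the handling of tied scores (earlier papers first) is an assumption, since the article does not say how ties were broken.

```python
QUOTAS = [("A", 7), ("B", 24), ("C", 38), ("D", 24), ("E", 7)]


def grades_by_rank(scores):
    """Return one letter grade per paper, assigned by rank under QUOTAS."""
    if len(scores) != sum(n for _, n in QUOTAS):
        raise ValueError("the quotas assume exactly one hundred papers")
    # Highest score first; Python's stable sort keeps tied papers in order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    letters = [letter for letter, n in QUOTAS for _ in range(n)]
    grades = [""] * len(scores)
    for rank, paper in enumerate(order):
        grades[paper] = letters[rank]
    return grades


def per_cent_disagreement(grades_a, grades_b):
    """Per cent of papers whose grade differs between two scoring methods."""
    differing = sum(a != b for a, b in zip(grades_a, grades_b))
    return 100 * differing / len(grades_a)


# Hypothetical scores: weighting swaps two papers at the A/B boundary.
raw = [100 - i for i in range(100)]
weighted = list(raw)
weighted[6], weighted[7] = weighted[7], weighted[6]

print(per_cent_disagreement(grades_by_rank(raw), grades_by_rank(weighted)))
```

Because the grade boundaries are fixed by rank, even a small reordering near a boundary changes two grades at once: one paper moves up and another moves down.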

Certainly if the objectivity of new type examinations rests with the 
decision to give each item an equal weight, the objectivity is spurious. 
This study seems to suggest that such may be the case, for when 
capable persons weight the items in a new type examination with 
respect to their relative importance, the marks given the papers differ 
significantly from those derived from raw scores. Some items of 
information are naturally more important than others. The reliability 
of raw scores is based upon the assumption that students will always 
know important and unimportant facts in a constant ratio. There is 
not sufficient evidence to believe that this assumption is correct, and 
when it is not, raw score determinations are not reliable. 
THE ADJUSTMENT OF FREQUENCY DISTRIBUTIONS


HERBERT S. CONRAD
Institute of Child Welfare, University of California 


Suppose that a phenomenon is most accurately measured by a
master frequency-distribution, S; other frequency distributions of
inferior accuracy are A, B, and C. It is required to adjust distributions
A, B, and C so that $\sigma_S = \sigma_A = \sigma_B = \sigma_C$, and
$M_S = M_A = M_B = M_C$. The adjustment of the standard deviations
($\sigma$) may be effected by multiplying each score by
$\sigma_S/\sigma_A$, $\sigma_S/\sigma_B$, and $\sigma_S/\sigma_C$
respectively. The adjustment of the means ($M$) is readily effected by
the addition of a constant appropriate for each frequency distribution.
Neither of these operations will in any way affect the correlation of
the scores in distributions A, B, and C with other measures.

For example: Distribution S consists of the IQ’s of fourteen children
in a mental-test battery of 6 tests.¹ Distribution A consists of the
average of 2-3 nursery-school teachers’ ratings of the intelligence of
these children. $\sigma_S = 13.8388$, $M_S = 117.36$. $\sigma_A = 1.0373$,
$M_A = 3.65$. $r_{SA} = +.751$. The adjustment of $\sigma_A$ to $\sigma_S$
requires the multiplication of each score of Distribution A by
$\sigma_S/\sigma_A$, or $13.8388/1.0373$, or 13.3412.
The mean of this new distribution (A′) is 48.70. The adjustment of
$M_{A'}$ to $M_S$ requires the addition of 68.66 to each score of Distribution
A′. The final adjusted distribution, A″, is given in column (5).
The reader can verify for himself that (within the limits imposed by
the dropping of decimals in the formation of Distribution A′),
$M_{A''} = M_S$, $\sigma_{A''} = \sigma_S$, and $r_{SA} = r_{SA''}$.
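
The two-step adjustment just described can be sketched in Python, using the fourteen pairs of scores from Table I. The function names are illustrative, and the standard deviation is taken with divisor N (`statistics.pstdev`), which is an assumption about the author's computation. Each IQ is increased by 0.5 because the table's footnote says that an entry such as 87 stands for 87.0-87.9. Values recomputed from the rounded table entries may differ slightly from the printed figures.

```python
from math import sqrt
from statistics import fmean, pstdev

# Column (2) of Table I, moved to class midpoints (87 -> 87.5, etc.).
iq = [87, 138, 125, 103, 105, 115, 118, 119, 125, 123, 114, 105, 116, 143]
master = [q + 0.5 for q in iq]
# Column (3) of Table I: average teachers' ratings (Distribution A).
ratings = [1.75, 4.65, 4.65, 3.70, 1.80, 3.90, 4.40,
           4.60, 2.85, 3.85, 3.95, 2.15, 4.05, 4.80]


def adjust_to_master(scores, master_scores):
    """Rescale `scores` so their mean and sigma equal those of the master."""
    factor = pstdev(master_scores) / pstdev(scores)   # sigma_S / sigma_A
    scaled = [factor * s for s in scores]             # Distribution A'
    shift = fmean(master_scores) - fmean(scaled)      # M_S - M_A'
    return [s + shift for s in scaled]                # Distribution A''


def pearson_r(xs, ys):
    """Pearson product-moment correlation."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


adjusted = adjust_to_master(ratings, master)
print(round(fmean(adjusted), 2), round(pstdev(adjusted), 4))
print(round(pearson_r(master, ratings), 3), round(pearson_r(master, adjusted), 3))
```

Multiplying by a positive constant and adding a constant is a linear change of scale, so the correlation with the master distribution is the same before and after the adjustment.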

The method of adjusting distributions given above makes no 
assumptions whatsoever, except such as are implied in the use of a 
“master” or “standard” frequency-distribution, S; and except such
as are implied in the use of the mean as a measure of central tendency, 
and sigma as a measure of variability. The master or standard 
distribution may or may not be normal, depending only on the empiri- 
cal facts as found by the measuring instrument (or instruments) 
employed. 

Unless distributions S and A differ markedly in skewness (which
in the nature of the case is not very likely), the method given will
be found to serve its purpose very satisfactorily. The same limitation
concerning skewness applies equally to the method of Woodworth.²
The present method will be found shorter than Woodworth’s, since
no deviations from the mean need to be computed, and only the
“non-standard” distribution (or distributions) needs to be manipulated.
The present method is also more desirable than Woodworth’s,
if the scores of the “standard” distribution have acquired a special
definite meaning to the experimenter, which ordinary sigma scores do
not possess, e.g., if the standard or master distribution consists
(as in the illustration above) of IQ’s.

TABLE I

(1)      (2)               (3)                 (4)             (5)
Child    Distribution S    Distribution A      Distribution    Distribution
         (average IQ)*     (average rating)    A′              A″

A         87               1.75                23.35            92.01
B        138               4.65                62.04           130.70
C        125               4.65                62.04           130.70
D        103               3.70                49.36           118.02
E        105               1.80                24.01            92.67
F        115               3.90                52.03           120.69
G        118               4.40                58.70           127.36
H        119               4.60                61.37           130.03
I        125               2.85                38.02           106.68
J        123               3.85                51.36           120.02
K        114               3.95                52.70           121.36
L        105               2.15                28.68            97.34
M        116               4.05                54.03           122.69
N        143               4.80                64.04           132.70
Mean     117.36            3.65                48.70           117.36
σ         13.8388          1.0373              13.8229          13.8407

* 87 means 87.0-87.9. This fact must be remembered by the reader undertaking
to verify the figures given. For 87 we have in our computations used 87.5; for
138, 138.5; etc.

¹ The number of tests given each child varies from 5-8. Of this number,
about half are repetitions (at suitable intervals) of tests given previously.
² Woodworth, R. S.: Combining the Results of Several Tests, A Study in
Statistical Method. Psychological Review, Vol. XIX, 1912, pp. 97-123.
MEASURING THE VALIDITY OF PREDICTED SCORES


HAROLD A. EDGERTON 
Ohio State University 


Among the problems very frequently attacked by psychologists 
and educators is that of prediction. Typically, such investigations 
involve a number of steps. Obtaining an adequate criterion is a 
major problem. More attention, however, has been paid to the 
prognostic devices—measurements, ratings, tests, etc. After having 
secured a criterion and a test or a number of tests that may be used to 
predict the criterion, the method of correlation is employed. Only 
simple correlations between the criterion and each test or the multiple
correlation may be used. Having done this much, some investigators 
consider the job completed. Others will go further and the rest should. 
Two directions may be followed although the second one is much 
preferred. The first is to obtain like measures on a second population, 
correlate the obtained scores and compare the results with those of the 
first sample. The second procedure is to obtain like measures on a 
second population. Then, using the same weightings for the prognos- 
tic measures as were used in the first population, actually predict the 
criterion by means of the test scores. The predicted criterion scores 
may then be compared with the actual criterion scores and their 
degree of likeness determined. Some such procedure is necessary 
before the experimenter can be very certain that his prediction equa- 
tions are valid for use in some second population of which the experi- 
mental population may be assumed to be an adequate sample. 

A usual method of measuring, in the verification or second sample, 
the goodness of such predicted scores is that of correlating them with 
the actual criterion scores. For example: The partial regression 
equation obtained from the first population is used to predict the 
scholarship of the second group. Then when the actual scholarship 
of the second group has accrued, their predicted attainment may be 
correlated with their actual attainment. 

This sort of procedure is a measure of the co-variation of the two 
sets of scores, but it does not account for errors of prediction that may 
be common to all the individual cases. For example, if all the pre- 
dicted scores were five units less than the actual scores, correlating 
the actual scores and the predicted scores would not show such a fact. 
Lack of correlation reveals only those errors of estimation that might 
be termed chance errors. Such chance errors may be measured by 
use of the standard error of estimate, using the actual regression line
of the correlation concerned as an origin. The formula for such a
standard error of estimate is

$$\sigma_{0 \cdot y} = \sigma_0 \sqrt{1 - r_{0y}^2} \qquad (1)$$

where $\sigma_0$ is the standard deviation of the dependent or criterion
variable and $r_{0y}$ is the correlation between the estimated and actual
scores.

If the regression equation derived from data of one population 
be used in estimating scores of a second population, the formula given 
(Equation 1) does not take any account of the constant term of the 
regression equation or any change in the slope of the partial regression 
due to differences in the variability of the samplings of the first and 
second populations. A high correlation between the actual and esti- 
mated scores signifies only that the cases or scores are placed in about 
the same relative positions in each of the two distributions (distribu- 
tion of actual scores and distribution of estimated scores). A correla- 
tion coefficient makes no reference to the likeness or unlikeness of the 
mean values of the two distributions. 

In any investigation of the validity of predicted scores, it would be 
better if the predicted scores were used as the origin for computing the 
mean square error of estimate. Such a measure will be influenced by 
all discrepancies of estimation. 
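
The point about constant errors can be illustrated with a short Python sketch. The scores are hypothetical; every predicted score is set five units below the actual score, as in the example given earlier. The correlation is perfect, yet the root-mean-square error taken about the predicted scores is five units.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def error_about_predictions(actual, predicted):
    """Square root of the mean squared difference (actual - predicted)."""
    n = len(actual)
    return sqrt(sum((x - y) ** 2 for x, y in zip(actual, predicted)) / n)


# Hypothetical criterion scores; each prediction is five units too low.
actual = [52.0, 61.0, 47.0, 70.0, 58.0]
predicted = [x - 5.0 for x in actual]

print(pearson_r(actual, predicted))
print(error_about_predictions(actual, predicted))
```

A standard error of estimate built from the correlation alone would be zero in this case, so it cannot reveal the constant five-unit discrepancy.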

The mean square error of estimate, using the predicted scores as
an origin may be written

$$\sigma = \sqrt{\frac{\sum (X_0 - Y)^2}{N}} \qquad (2)$$

where

$X_0$ = actual score made by any individual
$Y$ = corresponding predicted score.

Further

$$X_0 = x_0 + M_0$$
$$Y = y + M_y$$

Substituting the above identities in Equation 2,

$$\sigma = \sqrt{\frac{\sum (x_0 - y + M_0 - M_y)^2}{N}} \qquad (3)$$

Squaring both sides of Equation 3 for convenience and expanding the
right-hand member,

$$\sigma^2 = \frac{1}{N} \sum \left( x_0^2 + y^2 + M_0^2 + M_y^2 - 2x_0 y + 2x_0 M_0 - 2x_0 M_y - 2yM_0 + 2yM_y - 2M_0M_y \right) \qquad (4)$$
Summing and substituting such expressions as $N\sigma_0^2$ for $\sum x_0^2$,
$N\sigma_0\sigma_y r_{0y}$ for $\sum x_0 y$, and zero for $\sum x_0$,

$$\sigma^2 = \sigma_0^2 + \sigma_y^2 + M_0^2 + M_y^2 - 2\sigma_0\sigma_y r_{0y} - 2M_0M_y \qquad (5)$$

Collecting terms and extracting the square root of each side,

$$\sigma = \sqrt{\sigma_0^2 + \sigma_y^2 + (M_0 - M_y)^2 - 2\sigma_0\sigma_y r_{0y}} \qquad (6)$$


Equation 6 is the equation desired. Now we may consider various 
equations for Y. In the fields of psychology and education, the partial 
regression equation is the equation most used for prediction of scores. 
It is linear in all variables. Such an equation is of the form 


$$Y = b_{01.23 \ldots n} X_1 + b_{02.13 \ldots n} X_2 + \cdots + b_{0n.12 \ldots (n-1)} X_n + M_0 - b_{01.23 \ldots n} M_1 - b_{02.13 \ldots n} M_2 - \cdots - b_{0n.12 \ldots (n-1)} M_n \qquad (7)$$
In Equation 7 the partial regression coefficients $b_{01.23 \ldots n}$, etc.,
are those used for weighting the gross scores so that

$$b_{01.23 \ldots n} = \beta_{01.23 \ldots n} \frac{\sigma_0}{\sigma_1} \qquad (8)$$
For convenience in writing the equations, $b_{01.23 \ldots n}$ will be referred
to as $b_1$, $b_{02.13 \ldots n}$ as $b_2$, etc., so that Equation 7 will have the
form

$$Y = b_1X_1 + b_2X_2 + \cdots + b_nX_n + (M_0 - b_1M_1 - b_2M_2 - \cdots - b_nM_n) \qquad (9)$$


The constant terms of the Equation 9, in parenthesis, will be referred 
to by a single term c, hence 


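$$Y = b_1X_1 + b_2X_2 + \cdots + b_nX_n + c \qquad (10)$$

The mean of the distribution of Y is

$$M_y = b_1M_1 + b_2M_2 + \cdots + b_nM_n + c \qquad (11)$$

The standard deviation of the distribution of Y is

$$\sigma_y = \left[ (b_1\sigma_1)^2 + (b_2\sigma_2)^2 + \cdots + (b_n\sigma_n)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12} + 2b_1b_3\sigma_1\sigma_3 r_{13} + \cdots + 2b_ib_j\sigma_i\sigma_j r_{ij} + \cdots + 2b_{n-1}b_n\sigma_{n-1}\sigma_n r_{(n-1)n} \right]^{1/2} \qquad (12)$$

It is now possible to write the general equation for the mean square
error of estimate, using the predicted scores as an origin.

$$\sigma = \left[ \sigma_0^2 + (b_1\sigma_1)^2 + (b_2\sigma_2)^2 + \cdots + (b_n\sigma_n)^2 + (M_0 - b_1M_1 - b_2M_2 - \cdots - b_nM_n - c)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12} + 2b_1b_3\sigma_1\sigma_3 r_{13} + \cdots + 2b_ib_j\sigma_i\sigma_j r_{ij} + \cdots + 2b_{n-1}b_n\sigma_{n-1}\sigma_n r_{(n-1)n} - 2\sigma_0 (b_1\sigma_1 r_{01} + b_2\sigma_2 r_{02} + \cdots + b_n\sigma_n r_{0n}) \right]^{1/2} \qquad (13)$$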
Since most predictions are made using only one, two or three independent
variables, the standard error of estimate for each is given
below. In the case of one independent variable,

$$Y = b_1X_1 + c$$
$$M_y = b_1M_1 + c$$
$$\sigma_y = b_1\sigma_1$$

and substituting in Equation 6

$$\sigma = \sqrt{\sigma_0^2 + (b_1\sigma_1)^2 + (M_0 - b_1M_1 - c)^2 - 2b_1\sigma_0\sigma_1 r_{01}} \qquad (14)$$
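In the case of two independent variables,

$$Y = b_1X_1 + b_2X_2 + c$$
$$M_y = b_1M_1 + b_2M_2 + c$$
$$\sigma_y = \sqrt{(b_1\sigma_1)^2 + (b_2\sigma_2)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12}}$$

Substituting in Equation 6

$$\sigma = \sqrt{\sigma_0^2 + (b_1\sigma_1)^2 + (b_2\sigma_2)^2 + (M_0 - b_1M_1 - b_2M_2 - c)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12} - 2\sigma_0 (b_1\sigma_1 r_{01} + b_2\sigma_2 r_{02})} \qquad (15)$$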
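In the case of three independent variables,

$$Y = b_1X_1 + b_2X_2 + b_3X_3 + c$$
$$M_y = b_1M_1 + b_2M_2 + b_3M_3 + c$$
$$\sigma_y = \sqrt{(b_1\sigma_1)^2 + (b_2\sigma_2)^2 + (b_3\sigma_3)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12} + 2b_1b_3\sigma_1\sigma_3 r_{13} + 2b_2b_3\sigma_2\sigma_3 r_{23}}$$

Substituting the above in Equation 6

$$\sigma = \left[ \sigma_0^2 + (b_1\sigma_1)^2 + (b_2\sigma_2)^2 + (b_3\sigma_3)^2 + (M_0 - b_1M_1 - b_2M_2 - b_3M_3 - c)^2 + 2b_1b_2\sigma_1\sigma_2 r_{12} + 2b_1b_3\sigma_1\sigma_3 r_{13} + 2b_2b_3\sigma_2\sigma_3 r_{23} - 2\sigma_0 (b_1\sigma_1 r_{01} + b_2\sigma_2 r_{02} + b_3\sigma_3 r_{03}) \right]^{1/2} \qquad (16)$$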

In Equations 14, 15 and 16 above, the standard deviations, means 
and correlation coefficients are those derived from the data of the 
population for which the prediction is being made. 
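
As a check on the algebra, the two-variable case can be computed both ways in Python: directly from the individual predicted scores, and from the means, standard deviations, and correlations of the second population alone. The data and the weights b1, b2, c below are hypothetical (the weights stand for values carried over from a first population), and standard deviations are taken with divisor N.

```python
from math import sqrt
from statistics import fmean, pstdev


def corr(xs, ys):
    """Pearson correlation using divisor-N moments."""
    mx, my = fmean(xs), fmean(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))


def error_direct(actual, x1, x2, b1, b2, c):
    """Root-mean-square error about the predicted scores, case by case."""
    predicted = [b1 * u + b2 * v + c for u, v in zip(x1, x2)]
    return sqrt(fmean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def error_from_summary(actual, x1, x2, b1, b2, c):
    """Same quantity from summary statistics only (two-predictor formula)."""
    s0, s1, s2 = pstdev(actual), pstdev(x1), pstdev(x2)
    m0, m1, m2 = fmean(actual), fmean(x1), fmean(x2)
    r01, r02, r12 = corr(actual, x1), corr(actual, x2), corr(x1, x2)
    inside = (s0 ** 2 + (b1 * s1) ** 2 + (b2 * s2) ** 2
              + (m0 - b1 * m1 - b2 * m2 - c) ** 2
              + 2 * b1 * b2 * s1 * s2 * r12
              - 2 * s0 * (b1 * s1 * r01 + b2 * s2 * r02))
    return sqrt(inside)


# Hypothetical second-population data and first-population weights.
actual = [60, 72, 55, 81, 67, 74]
x1 = [10, 14, 9, 17, 12, 15]
x2 = [3, 5, 4, 6, 4, 7]
b1, b2, c = 2.5, 1.0, 30.0

print(error_direct(actual, x1, x2, b1, b2, c))
print(error_from_summary(actual, x1, x2, b1, b2, c))
```

The two results agree up to rounding, which illustrates why no predicted score need be computed for each case once the summary statistics are in hand.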

This paper has presented a method of measuring the validity of 
predicted scores. After having devised a means of predicting scores 
the equations obtained should be tried with a second population 
selected in the same way as the first. Correlating the predicted
criterion scores with the actual criterion scores in the second population
does not account for any “constant” errors due to differences in means,
standard deviations and correlations in the first and second populations.
Hence, as a measure of the validity of the predicted scores
it is recommended that a standard error of estimate be used. Such
a mean square error of estimate must be one taken about the predicted
scores as an origin rather than the regression of the particular correlation
concerned if the measure is to account for all discrepancies of
prediction. Such a formula is given in a form that makes unnecessary
the actual computation of a predicted score for each case concerned.
Some College Students and Their Problems, by Luella C. Pressey. Ohio
State University Press, 1929. Pp. VI + 97.


The book is one of case studies of “normal” students from several
institutions who, because of various unfortunate circumstances, got
into difficulty. The points exemplified and problems to which they 
give rise are: (1) Poor note taking; (2) no system of planning life 
and study; (3) star high school athlete with about sixth grade subject- 
matter achievement; (4) ignorance of grammar causing difficulty 
with foreign languages; (5) use of indefinite pronouns; (6) infantile 
reading habits; (7) attempt to work fourteen hours a day outside of
studies; (8) physical tiredness; (9) constipation; (10) infected tonsils,
adenoids, appendix and sinus; (11) hypothyroid; (12) overweight and 
unloved; (13) mother dependence; (14) childishness throughout 
adult life; (15) tyrannical and nagging father; (16) slow member of 
a bright family; (17) conflicting standards and lack of security in 
homes of divorced and re-married parents; (18) parental quarrels; (19) 
social isolation and overwork; (20) disappointment in daydreams of 
social success at college; (21) race handicap, overwork, unimpressive 
appearance, speech defect, inferiority feeling; (22) overcompensation 
for rural background; (23) imitating a brighter friend in laziness and 
frivolity; (24) boy-crazy girl from segregated background; (25) too 
many activities; (26) fundamentalist and puritanical standards; 
(27) attempt by mother to impose standards of her generation; (28) 
intense love with marriage remote; (29) fatherless boy, delinquent 
and criminal attitudes and practices, many physical defects; (30) 
fears connected with sex; (31) sent to college against his wishes, to 
help satisfy parental ambitions; (32) low intelligence, unhappy home, 
cheating to remain in college; (33) drinking to drown disappointment 
in missing fraternity life because of race; (34) no vocational plans or 
preparation; (35) no openings in work for which prepared; (36) low 
intelligence, poor preparation, driven by ambition to study law at 
age forty; (37) overweight, crippled, and stage struck; (38) sentimental
rather than realistic idea of nursing; (39) technical skill but poor 
general ability; (40) required subjects uninteresting and irrelevant. 
The cases are well written and are followed by questions designed 
to help the undergraduate make practical application to himself. 
The attempt to illustrate one feature in each case has, of course, 
resulted in oversimplifying the complex interactions and sometimes 
obscure causes of inadequate behavior.

GOODWIN WATSON.
Teachers College, Columbia University. 





Introduction to Abnormal Psychology, by V. E. Fisher. The Mac- 
millan Co., New York, 1929. Pp. X + 512. 


“An Introduction to Abnormal Psychology,” written as an
elementary text, fulfills its purpose admirably. Professor Fisher has 
succeeded in effecting a union between interest and fact, an accomplish- 
ment which will be appreciated especially by college students. The 
book is neither arid pedantry nor popularized pseudo-science, but a 
simple, clear-cut, and at times fascinating presentation of fundamental 
information concerning mental diseases and subnormal intellects. 

The author says in his Preface that he has “placed considerable
emphasis upon two points of view. The first of these is the assumption
that mental abnormality is to be regarded as a purely relative
matter . . . The second is that mental abnormalities can be most
adequately understood and dealt with when viewed as disorders of
the personality.” After stating his lines of attack, Professor Fisher
really carries them out. His recurrent emphasis upon the fact that
the so-called insane differ only in degree from the normal (and that
degree very often infinitesimally slight) will go a long way toward
dispelling from the minds of students the popular conception of what
constitutes a so-called “mental case.” Growing out of his second
thesis is the statement that “we must always look upon any functional
mental disorder, in the light of our present knowledge, as the result
of an interplay of hereditary and environmental factors.”

This moderate point of view is an indication of the spirit of the 
whole book. The author approaches his work with poise and broad 
understanding, and at no place allows himself to be led to extremes. 
He makes it perfectly clear that he is not a psycho-analyst, yet he
is not averse to accepting many of the undoubtedly valuable theories
of Freud, Jung, and Adler. Conflicts with respect to sex, for example,
are assigned an important place among the causes of mental disorders,
but do not become all-inclusive. Possibly Professor Fisher is
too greatly influenced by McDougall, but McDougall’s work must 
necessarily receive a comparatively large place in any survey of abnor- 
mal psychology. 

In a book of over 500 pages, the author devoted but 39 to feeble- 
mindedness, which seems rather scant attention to so important a 
phase of abnormal psychology. Coming as it does in the last chapter, 
this discussion brings the reader up rather abruptly with a feeling of 
incompleteness. But this is a minor criticism which can be immediately
offset by the remark that the Glossary is excellent.


HERBERT A. CARROLL. 
Teachers College, Columbia University. 





Exercise Manual in Statistics, by Karl John Holzinger and Blythe 
Clayton Mitchell. Boston: Ginn and Co., 1929. Pp. 160. 


According to the authors, this manual is intended for use with 
any of the standard textbooks in educational statistics. It could 
easily be so used, but it has obviously been written more especially 
to accompany Holzinger’s “Statistical Methods for Students of Education,”
which it follows closely in nomenclature and in the general
list of topics treated. The manual is well gotten up and contains many
problems with answers worked out, which should prove valuable
when used to supplement the problems given in textbooks. Students
regularly ask for more practice material, and this little book
in large degree fulfills that need.

Some of the topics covered under the chapter on the normal curve,
several of the more advanced methods of correlation including Spearman’s
tetrad-differences, and methods of fitting curves will, in the
reviewer’s opinion, have to be omitted in a first year course. Probably
most teachers will count themselves lucky if they are able to
train even graduate students to use with understanding the more
elementary and fundamental techniques. This is, of course, no
criticism of the book but rather of the preparation of the large bulk
of graduate students in psychology and education for statistical work.
It is certainly a good idea to include in a manual many methods which
will be grasped by a relatively small per cent of graduate students,
since these, after all, will be the ones to employ statistics in research.

H. E. Garrett. 


Columbia University. 
Fundamentals of Educational Psychology, by Ira Morris Gast and
Harley Clay Skinner, Benj. H. Sanborn and Co., 1929. Pp. 
XIV + 354. 


The announced purpose of this book is to bring the subject nearer 
to the classroom, and to correct the overemphasis and misinterpreta- 
tion of intelligence tests, endocrinology, heredity, and behavioristic 
psychology. 

This is a fair statement of the viewpoint of the authors. Their
attitude toward behaviorism is clear from their analysis of human
behavior into heredity, environment, and will, which is regarded as
having a limited freedom. The book throughout stresses environment.
Differences in memory are attributed to learning habits and
attention (p. 197). “General mental ability (intelligence) is based
on the fact of transfer” (p. 235). It may be said that the book is
not at all unfavorable to the theory of transfer. Finally there may
be added the following estimate of intelligence testing: “When we
sort pupils according to intelligence quotient we are probably sorting
them as much on the basis of physical fitness as on that of intelligence”
(p. 57).

In regard to the purpose of bringing the subject nearer to the classroom,
it can be stated that the authors have clearly attempted to
produce a text rather than a scientific treatise. At the end of every
chapter are study questions, theoretically so useful; actually so
little used. Many passages of the book are of an inspirational character.
“The greatest Teacher rarely used a ‘don’t.’ His rules for
living were couched in such terms as, do, look, seek, go, and act”
(p. 296). “What boy will not be a better boy because he has read
the story of Lindbergh?” (p. 15). There are almost no diagrams or
tables in the book; in their place appear about twenty photographs
of prominent psychologists.

The book is not free from careless statements. “The complexity
of the reaction patterns vary considerably” (p. 276). “The original
nature of an individual is a product of his biological inheritance, prenatal
environment, and maturation” (p. 1). But on p. 21: “Nurture
begins nine months before the birth of the child.” On p. 294 there is
a paragraph headed “Glands and the Emotions.” In fairness to
the authors the major portion of it is quoted: “Cannon, Crile, and
other investigators have demonstrated that the emotions have a
marked effect on the glands, calling forth secretions from some glands
and inhibiting secretions in others. This is especially noticeable
in the case of the thyroid glands where increased secretion produces
greater susceptibility to emotions, especially anger, laughter, or hate.
Adrenalin, the active principle of these glands, is a very powerful
aug...”

The reviewer was not able to discover the plan of the book. Sev- 
eral chapters appear to be out of their proper places. Learning and 
memory are treated in widely separated portions of the book, a fact 
which suggests the classifications of an earlier day. In short, in spite 
of many references to recent experimentation, the main substance of 
the book seems to belong to the psychology of some years ago. 


MELVIN RIGG.
Kenyon College. 





The Thinking Machine, by C. J. Herrick. Chicago: The University
of Chicago Press, 1929. Pp. 347. 


Professor Herrick endeavors to interpret all of human activity in
mechanistic terms, but his concept of a machine is not the limited
traditional one. He rejects the implication of machine products
being simply “the passive product of external force” and defines
machines as “any contrivances to do some kind of work and deliver
some kind of product.” This transformation of energy can be ultimately
stated in laws. For Professor Herrick, to be “mechanistic”
is essentially the basis for being “scientific.” Further in his interpretation,
“the outstanding feature of all mechanisms is control of the
course of events that go on within them.”

He proceeds first to survey the field of biology and show that the
mechanistic approach is valid here. Since the development of the
various functions of the body has been a regular one from the standpoint
of evolution and since recent evidence supports the unity of the
organism, this mechanistic approach is valid in the interpretation of
so-called mental states. “All life, including the mental or spiritual
life, is a mechanistic process.”

The efficacy of Herrick’s approach is open to question. His real 
desire is to show that so-called mental and spiritual problems are after 
all fundamentally like the problems of the action of the digestive system 
or the muscles and as rapidly as possible should be subjected to the 
scientific approach. There is no attempt to claim any great amount of 
scientific knowledge about bodily functions. The point is that the 
scientific approach is valid and that no aspect of human experience
should be excluded from such an approach. “The scientific method
is only one way to enlarge and rectify our experience. It is a safe way
and a satisfying way. The claim here made for it is that it will take
us farther into the domain of the spiritual life than some of us have
supposed.”

Many individuals of course are in complete agreement and need 
no argument. Those who do not agree are not likely to be convinced 
by an attempt to compare man with a machine—even when the 
machine is shown in the favorable position of this volume. The 
difficulty is deep-seated and the concepts involved are poorly defined. 
To the present writer, it would seem that Herrick did a better service 
in “Fatalism or Freedom.”

Analogies have their values but just as certainly do they have 
limitations and their limitations become more important whenever 
the concepts attached by different people to the analogy are not 
clearly defined, as in this case. The analogy is merely an initial 
device which may prove helpful to some people. To make it too 
prominent is to limit thinking. 

The book is well written in a language which is understandable 
to others than specialists and has the advantage of an index. The 
definition of psychology will not be helpful in clearing up the question 
of the domain of psychology but the summary of the development 
and integration of various bodily functions (Part II and III) is 
excellent.

RALPH B. SPENCE.

Teachers College, Columbia University. 





Vocational Psychology and Character Analysis, by H. L. Hollingworth. 
New York: D. Appleton and Co., 1929. Pp. X + 409. 


Neither a handbook of vocational guidance, nor a manual of per- 
sonnel procedures, Professor Hollingworth’s latest work represents 
the aspect of research rather than that of practice. It is intended as 
a general survey of a field of psychological investigation which, during 
the last quarter-century, has grown so vast that special texts are 
written concerning aspects of it alone. Good general texts of the 
nature of the present volume, published during the last five 
years, are rare. To have one presented by so able a thinker and 
so skilled a teacher as Professor Hollingworth is indeed to be 
appreciated. 
Its outstanding characteristic is probably its moderation. Traditional 
methods of vocational selection are not cast away simply because 
they are old; rather is it the expressed aim of this work “to investigate, 
by controlled experiment, the degree of value attaching to these 
methods and the conditions under which greater value can be secured.” 
One might nevertheless possibly enquire whether there is not also 
some virtue in a more radical excision of dead wood. 

Disposing briefly of such antecedents of vocational psychology as 
phrenology and physiognomy, Professor Hollingworth turns to a 
study of traditional methods, including letters of application, 
recommendations, photographs, self-analysis, and the personal interview. 
The main criticisms which he has to pass in these chapters refer to 
unreliability, lack of objectivity, and the need for supplementary 
information. Suggesting the use, wherever possible, of the method 
of measurement instead of that of personal report, he proceeds to a 
study of objective methods of identifying and measuring human 
traits. Psychological tests, both general and special; tests of per- 
sonality traits; the measurement of interests; the value of scholastic 
records; the degree of correspondence between personal psychograph 
and job-analysis; all receive brief yet comprehensive treatment. 
Statistical methods scarcely obtrude upon the reader’s attention; but, 
when dealt with, are handled clearly and concisely. Two final 
chapters, one on “Intelligence and Vocational Aptitude,” and the other on 
“Vocational Aptitudes in Women” (the latter constituting probably 
one of the most significant chapters in the book), bring to a close an 
interesting, scholarly, up-to-date survey of vocational psychology. 

This book is presumably not intended as an elementary text; 
it is written for the intelligent reader. Nevertheless the appendices 
include a series of useful class exercises. There is but little illustrative 
material, but it is all entirely relevant. If there is any adverse 
criticism to pass, it is that even the little time devoted to the antece- 
dents of vocational psychology was too much. That ground has 
already been thoroughly covered in a recent text by Hull. Moreover, 
the discussion of such material would seem to be a waste of time on a 
method of vocational selection used only by the most backward of 
employing agencies. One might ask whether possibly improvement 
might not have been effected had the same amount of effort been 
devoted to suggestions concerning the lines along which further 
research should be conducted. O. L. Harvey. 

University of Pittsburgh. 